
In [1]:
#from google.colab import drive
#drive.mount('/content/drive')

Importing the necessary Python libraries and modules

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display
pd.options.display.max_columns = None
pd.options.display.max_rows = None
In [3]:
# Reading the dataset

project_path = '/content/drive/My Drive/Colab/'
file_name ='input_data.xlsx'
In [4]:
# encoding is not a read_excel parameter in current pandas and is not needed for .xlsx files
df = pd.read_excel(r'C:\Users\Nishant\Downloads\input_data.xlsx')

#df=pd.read_excel(project_path+file_name)

#Displaying the top 10 records of the dataframe

df.head(10)
Out[4]:
Short description Description Caller Assignment group
0 login issue -verified user details.(employee# & manager na... spxjnwir pjlcoqds GRP_0
1 outlook \r\n\r\nreceived from: hmjdrvpb.komuaywn@gmail... hmjdrvpb komuaywn GRP_0
2 cant log in to vpn \r\n\r\nreceived from: eylqgodm.ybqkwiam@gmail... eylqgodm ybqkwiam GRP_0
3 unable to access hr_tool page unable to access hr_tool page xbkucsvz gcpydteq GRP_0
4 skype error skype error owlgqjme qhcozdfx GRP_0
5 unable to log in to engineering tool and skype unable to log in to engineering tool and skype eflahbxn ltdgrvkz GRP_0
6 event: critical:HostName_221.company.com the v... event: critical:HostName_221.company.com the v... jyoqwxhz clhxsoqy GRP_1
7 ticket_no1550391- employment status - new non-... ticket_no1550391- employment status - new non-... eqzibjhw ymebpoih GRP_0
8 unable to disable add ins on outlook unable to disable add ins on outlook mdbegvct dbvichlg GRP_0
9 ticket update on inplant_874773 ticket update on inplant_874773 fumkcsji sarmtlhy GRP_0
In [5]:
#Displaying the data type of each attribute

df.dtypes
Out[5]:
Short description    object
Description          object
Caller               object
Assignment group     object
dtype: object
In [6]:
#Displaying the shape of the dataframe

df.shape
Out[6]:
(8500, 4)

This shows that there are 8500 rows and 4 attributes in the dataframe.

In [7]:
# Displaying the information regarding the attributes

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8500 entries, 0 to 8499
Data columns (total 4 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   Short description  8492 non-null   object
 1   Description        8499 non-null   object
 2   Caller             8500 non-null   object
 3   Assignment group   8500 non-null   object
dtypes: object(4)
memory usage: 265.8+ KB

The Caller column mainly contains details of the user who raised the incident; it is of little use in our analysis and can be dropped. Assignment group is our target column with multiple classes, so this is a multiclass classification problem.
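Because the target has multiple classes, it can help to check how the tickets are spread across the assignment groups before modelling. The following is a minimal sketch under stated assumptions: `sample` is a small hypothetical frame standing in for `df`, and its values are made up for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the ticket data; the notebook itself would use df.
sample = pd.DataFrame({
    "Short description": ["login issue", "outlook", "job failed", "skype error"],
    "Assignment group": ["GRP_0", "GRP_0", "GRP_8", "GRP_0"],
})

# Number of distinct target classes.
n_classes = sample["Assignment group"].nunique()

# Tickets per class, and each class's share of the total.
class_counts = sample["Assignment group"].value_counts()
class_share = sample["Assignment group"].value_counts(normalize=True)

print(n_classes)
print(class_counts)
print(class_share.round(2))
```

Running the same calls on `df['Assignment group']` would show whether a few groups dominate the data, which affects how a classifier should be evaluated.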

In [8]:
# Dropping the caller column

df1 = df.drop('Caller',axis=1)

# Displaying the top 5 records of the dataframe after dropping the Caller column

df1.head()
Out[8]:
Short description Description Assignment group
0 login issue -verified user details.(employee# & manager na... GRP_0
1 outlook \r\n\r\nreceived from: hmjdrvpb.komuaywn@gmail... GRP_0
2 cant log in to vpn \r\n\r\nreceived from: eylqgodm.ybqkwiam@gmail... GRP_0
3 unable to access hr_tool page unable to access hr_tool page GRP_0
4 skype error skype error GRP_0
In [9]:
#Checking the null/missing values in the dataframe

df1.isnull().sum()
Out[9]:
Short description    8
Description          1
Assignment group     0
dtype: int64
In [10]:
!pip install missingno
Requirement already satisfied: missingno in c:\users\nishant\anaconda3\lib\site-packages (0.4.2)
Requirement already satisfied: numpy in c:\users\nishant\anaconda3\lib\site-packages (from missingno) (1.18.5)
Requirement already satisfied: seaborn in c:\users\nishant\anaconda3\lib\site-packages (from missingno) (0.10.1)
Requirement already satisfied: scipy in c:\users\nishant\anaconda3\lib\site-packages (from missingno) (1.4.1)
Requirement already satisfied: matplotlib in c:\users\nishant\anaconda3\lib\site-packages (from missingno) (3.2.2)
Requirement already satisfied: pandas>=0.22.0 in c:\users\nishant\anaconda3\lib\site-packages (from seaborn->missingno) (1.0.5)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in c:\users\nishant\anaconda3\lib\site-packages (from matplotlib->missingno) (2.4.7)
Requirement already satisfied: python-dateutil>=2.1 in c:\users\nishant\anaconda3\lib\site-packages (from matplotlib->missingno) (2.8.1)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\nishant\anaconda3\lib\site-packages (from matplotlib->missingno) (1.2.0)
Requirement already satisfied: cycler>=0.10 in c:\users\nishant\anaconda3\lib\site-packages (from matplotlib->missingno) (0.10.0)
Requirement already satisfied: pytz>=2017.2 in c:\users\nishant\anaconda3\lib\site-packages (from pandas>=0.22.0->seaborn->missingno) (2020.1)
Requirement already satisfied: six>=1.5 in c:\users\nishant\anaconda3\lib\site-packages (from python-dateutil>=2.1->matplotlib->missingno) (1.15.0)
In [11]:
import missingno as msno
In [12]:
# Visualizing the number of missing values as a bar chart  

msno.bar(df1)
Out[12]:
<matplotlib.axes._subplots.AxesSubplot at 0x24736d30c40>

Thus we can observe that there are a total of 8 NaN values in the Short description attribute and 1 NaN value in the Description attribute.
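The same counts can also be expressed as a share of all rows; 8 missing values out of 8500 is roughly 0.09%. A minimal sketch of such a summary is shown below, assuming a small hypothetical frame `sample` in place of `df1` (the values are illustrative, not taken from the dataset).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df1 with a few missing entries.
sample = pd.DataFrame({
    "Short description": ["login issue", np.nan, "outlook", "vpn"],
    "Description": ["details", "received from", np.nan, "vpn"],
    "Assignment group": ["GRP_0", "GRP_34", "GRP_0", "GRP_0"],
})

# isnull().sum() counts missing values per column;
# isnull().mean() gives the fraction of rows that are missing.
missing = pd.DataFrame({
    "missing": sample.isnull().sum(),
    "percent": (sample.isnull().mean() * 100).round(2),
})

print(missing)
```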

In [13]:
#Displaying the records with NaN value in Short Description and Description column

df1[df1.isnull().any(axis=1)]
Out[13]:
Short description Description Assignment group
2604 NaN \r\n\r\nreceived from: ohdrnswl.rezuibdt@gmail... GRP_34
3383 NaN \r\n-connected to the user system using teamvi... GRP_0
3906 NaN -user unable tologin to vpn.\r\n-connected to... GRP_0
3910 NaN -user unable tologin to vpn.\r\n-connected to... GRP_0
3915 NaN -user unable tologin to vpn.\r\n-connected to... GRP_0
3921 NaN -user unable tologin to vpn.\r\n-connected to... GRP_0
3924 NaN name:wvqgbdhm fwchqjor\nlanguage:\nbrowser:mic... GRP_0
4341 NaN \r\n\r\nreceived from: eqmuniov.ehxkcbgj@gmail... GRP_0
4395 i am locked out of skype NaN GRP_0
In [14]:
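# Filling the NaN values in the Short description and Description columns with a space (" ")
# Assigning the result back avoids the chained in-place fillna, which newer pandas versions warn about

df1['Short description'] = df1['Short description'].fillna(" ")
df1['Description'] = df1['Description'].fillna(" ")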
In [15]:
#Again Checking the null/missing values in the dataframe

df1.isnull().sum()
Out[15]:
Short description    0
Description          0
Assignment group     0
dtype: int64
In [16]:
#Concatenate Short Description and Description columns in a new column "New Description"

df1['New Description'] = df1['Short description'] + ' ' +df1['Description']

#Displaying the first 5 records of the new concatenated dataframe

df1.head()
Out[16]:
Short description Description Assignment group New Description
0 login issue -verified user details.(employee# & manager na... GRP_0 login issue -verified user details.(employee# ...
1 outlook \r\n\r\nreceived from: hmjdrvpb.komuaywn@gmail... GRP_0 outlook \r\n\r\nreceived from: hmjdrvpb.komuay...
2 cant log in to vpn \r\n\r\nreceived from: eylqgodm.ybqkwiam@gmail... GRP_0 cant log in to vpn \r\n\r\nreceived from: eylq...
3 unable to access hr_tool page unable to access hr_tool page GRP_0 unable to access hr_tool page unable to access...
4 skype error skype error GRP_0 skype error skype error
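The NaN values are filled before this step because concatenating strings with `+` propagates NaN: a row with a missing Short description would end up with a missing combined text. The sketch below illustrates this on a hypothetical two-row frame, and shows one possible alternative, `Series.str.cat` with `na_rep`, which is not what the notebook uses but gives a similar result without a separate fill step.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame mimicking the ticket columns.
sample = pd.DataFrame({
    "Short description": ["skype error", np.nan],
    "Description": ["skype error", "user unable to login to vpn"],
})

# Plain + concatenation propagates NaN, which is why the NaNs are filled first.
naive = sample["Short description"] + " " + sample["Description"]

# str.cat with na_rep treats a missing value as an empty string;
# str.strip() removes the leading space that is left behind.
combined = (
    sample["Short description"]
    .str.cat(sample["Description"], sep=" ", na_rep="")
    .str.strip()
)

print(naive.isna().tolist())
print(combined.tolist())
```

Filling with a space, as done in the notebook, leaves extra whitespace at the start or end of the combined text for those rows, so a later `str.strip()` may still be worthwhile.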
In [17]:
# Displaying the duplicate records in the dataframe

df_copy = df1[['Short description', 'Description','Assignment group','New Description']].copy()
duplicateRowsDF = df_copy[df_copy.duplicated()]
duplicateRowsDF
Out[17]:
Short description Description Assignment group New Description
51 call for ecwtrjnq jpecxuty call for ecwtrjnq jpecxuty GRP_0 call for ecwtrjnq jpecxuty call for ecwtrjnq j...
81 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
123 unable to display expense report unable to display expense report GRP_0 unable to display expense report unable to dis...
157 ess password reset ess password reset GRP_0 ess password reset ess password reset
229 call for ecwtrjnq jpecxuty call for ecwtrjnq jpecxuty GRP_0 call for ecwtrjnq jpecxuty call for ecwtrjnq j...
235 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
242 windows password reset windows password reset GRP_0 windows password reset windows password reset
274 windows account locked windows account locked GRP_0 windows account locked windows account locked
301 windows password reset windows password reset GRP_0 windows password reset windows password reset
312 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
333 windows password reset windows password reset GRP_0 windows password reset windows password reset
380 unable to login to erp SID_34 unable to login to erp SID_34 GRP_0 unable to login to erp SID_34 unable to login ...
391 password reset request password reset request GRP_0 password reset request password reset request
393 password reset password reset GRP_0 password reset password reset
422 password reset password reset GRP_0 password reset password reset
493 ticket update on inplant_872730 ticket update on inplant_872730 GRP_0 ticket update on inplant_872730 ticket update ...
512 blank call //gso blank call //gso GRP_0 blank call //gso blank call //gso
516 outlook freezing issue outlook freezing issue GRP_0 outlook freezing issue outlook freezing issue
526 password reset password reset GRP_0 password reset password reset
544 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
571 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
580 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
584 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
587 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
662 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
667 job bkbackup_tool_powder_prod_full failed in j... received from: monitoring_tool@company.com\r\n... GRP_8 job bkbackup_tool_powder_prod_full failed in j...
681 password reset password reset GRP_0 password reset password reset
713 skype error while logging in skype error while logging in GRP_0 skype error while logging in skype error whil...
720 blank call blank call GRP_0 blank call blank call
724 blank call blank call GRP_0 blank call blank call
741 account locked in erp SID_34 account locked in erp SID_34 GRP_0 account locked in erp SID_34 account locked in...
770 blank call blank call GRP_0 blank call blank call
850 password reset password reset GRP_0 password reset password reset
873 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
882 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
892 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
899 HostName_1030 is currently experiencing high c... HostName_1030 is currently experiencing high c... GRP_12 HostName_1030 is currently experiencing high c...
954 unable to access collaboration_platform unable to access collaboration_platform GRP_0 unable to access collaboration_platform unable...
973 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
996 account unlock account unlock GRP_0 account unlock account unlock
1019 outlook not working outlook not working GRP_0 outlook not working outlook not working
1025 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
1064 job Job_1967d failed in job_scheduler at: 10/1... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_1967d failed in job_scheduler at: 10/1...
1125 blank call blank call GRP_0 blank call blank call
1126 account lockout account lockout GRP_0 account lockout account lockout
1129 password reset password reset GRP_0 password reset password reset
1131 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
1139 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
1148 windows password reset windows password reset GRP_0 windows password reset windows password reset
1153 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
1171 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
1216 needs to change password needs to change password GRP_0 needs to change password needs to change pass...
1217 windows account unlock windows account unlock GRP_0 windows account unlock windows account unlock
1253 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
1255 account locked. account locked. GRP_0 account locked. account locked.
1293 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
1354 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
1362 password reset request password reset request GRP_0 password reset request password reset request
1372 windows password reset windows password reset GRP_0 windows password reset windows password reset
1373 ess password reset ess password reset GRP_0 ess password reset ess password reset
1375 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
1392 account lockout account lockout GRP_0 account lockout account lockout
1394 blank call blank call GRP_0 blank call blank call
1395 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
1421 password reset request. password reset request. GRP_0 password reset request. password reset request.
1449 account locked. account locked. GRP_0 account locked. account locked.
1458 windows account locked windows account locked GRP_0 windows account locked windows account locked
1480 password reset password reset GRP_0 password reset password reset
1502 windows account unlock windows account unlock GRP_0 windows account unlock windows account unlock
1505 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
1508 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
1525 account locked. account locked. GRP_0 account locked. account locked.
1528 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
1571 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
1605 unable to login to collaboration_platform unable to login to collaboration_platform GRP_0 unable to login to collaboration_platform unab...
1618 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
1645 ess password reset ess password reset GRP_0 ess password reset ess password reset
1685 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
1691 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
1705 unable to log in to collaboration_platform unable to log in to collaboration_platform GRP_0 unable to log in to collaboration_platform un...
1728 account locked out account locked out GRP_0 account locked out account locked out
1744 phone issue phone issue GRP_0 phone issue phone issue
1745 windows password reset windows password reset GRP_0 windows password reset windows password reset
1802 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
1821 password reset from password_management_tool password reset from password_management_tool GRP_0 password reset from password_management_tool p...
1822 erp SID_37 password reset erp SID_37 password reset GRP_0 erp SID_37 password reset erp SID_37 password ...
1824 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
1827 ess password reset ess password reset GRP_0 ess password reset ess password reset
1831 password reset from password_management_tool password reset from password_management_tool GRP_0 password reset from password_management_tool p...
1844 unable to open a website unable to open a website GRP_0 unable to open a website unable to open a website
1851 reset passwords for fylrosuk kedgmiul using pa... the GRP_17 reset passwords for fylrosuk kedgmiul using pa...
1864 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
1884 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
1915 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
1978 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
1982 call came and got disconnected call came and got disconnected GRP_0 call came and got disconnected call came and g...
1985 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
2000 job Job_549 failed in job_scheduler at: 10/07/... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_549 failed in job_scheduler at: 10/07/...
2035 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
2038 account unlock account unlock GRP_0 account unlock account unlock
2043 unable to sign in to skype unable to sign in to skype GRP_0 unable to sign in to skype unable to sign in t...
2053 unable to access vpn unable to access vpn GRP_0 unable to access vpn unable to access vpn
2061 blank call // loud noise // gso blank call // loud noise // gso GRP_0 blank call // loud noise // gso blank call // ...
2062 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
2094 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
2103 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
2134 account locked account locked GRP_0 account locked account locked
2135 unable to log in to erp SID_34 unable to log in to erp SID_34 GRP_0 unable to log in to erp SID_34 unable to log i...
2141 blank call blank call GRP_0 blank call blank call
2196 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
2248 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
2264 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
2275 windows password reset windows password reset GRP_0 windows password reset windows password reset
2278 account unlock account unlock GRP_0 account unlock account unlock
2300 windows password reset windows password reset GRP_0 windows password reset windows password reset
2310 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
2320 unable to login to engineering tool unable to login to engineering tool GRP_0 unable to login to engineering tool unable to...
2355 windows password reset windows password reset GRP_0 windows password reset windows password reset
2360 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password...
2372 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
2387 account got locked account got locked GRP_0 account got locked account got locked
2400 unable to sign in to skype unable to sign in to skype GRP_0 unable to sign in to skype unable to sign in t...
2418 windows password reset windows password reset GRP_0 windows password reset windows password reset
2424 account locked. account locked. GRP_0 account locked. account locked.
2437 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
2463 blank call blank call GRP_0 blank call blank call
2479 account unlock account unlock GRP_0 account unlock account unlock
2481 windows account unlock windows account unlock GRP_0 windows account unlock windows account unlock
2482 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
2483 windows password reset windows password reset GRP_0 windows password reset windows password reset
2484 unable to sign in to skype unable to sign in to skype GRP_0 unable to sign in to skype unable to sign in t...
2494 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
2506 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
2516 password reset from password_management_tool password reset from password_management_tool GRP_0 password reset from password_management_tool p...
2529 password reset password reset GRP_0 password reset password reset
2533 reset passwords for qwsjptlo hnlasbed using pa... the GRP_17 reset passwords for qwsjptlo hnlasbed using pa...
2554 reset passwords for bxeagsmt zrwdgsco using pa... the GRP_17 reset passwords for bxeagsmt zrwdgsco using pa...
2633 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
2642 password reset from password_management_tool password reset from password_management_tool GRP_0 password reset from password_management_tool p...
2658 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
2666 password reset request password reset request GRP_0 password reset request password reset request
2683 ticket update ticket update GRP_0 ticket update ticket update
2701 account got locked account got locked GRP_0 account got locked account got locked
2714 call for ecwtrjnq jpecxuty call for ecwtrjnq jpecxuty GRP_0 call for ecwtrjnq jpecxuty call for ecwtrjnq j...
2720 german call german call GRP_0 german call german call
2786 windows password reset windows password reset GRP_0 windows password reset windows password reset
2789 blank call blank call GRP_0 blank call blank call
2820 skype issue : personal certificate error skype issue : personal certificate error GRP_0 skype issue : personal certificate error skyp...
2840 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
2857 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
2870 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
2872 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
2875 blank call blank call GRP_0 blank call blank call
2876 blank call blank call GRP_0 blank call blank call
2898 unlock personal number in ess unlock personal number in ess GRP_0 unlock personal number in ess unlock personal ...
2906 windows password reset windows password reset GRP_0 windows password reset windows password reset
2927 windows account locked windows account locked GRP_0 windows account locked windows account locked
2943 account locked account locked GRP_0 account locked account locked
2956 outlook not working outlook not working GRP_0 outlook not working outlook not working
2999 unable to login to windows unable to login to windows GRP_0 unable to login to windows unable to login to...
3026 password reset password reset GRP_0 password reset password reset
3032 unable to load outlook unable to load outlook GRP_0 unable to load outlook unable to load outlook
3056 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
3059 account unlock account unlock GRP_0 account unlock account unlock
3065 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
3076 password is not getting synchronized. password is not getting synchronized. GRP_0 password is not getting synchronized. password...
3085 call for ecwtrjnq jpecxuty call for ecwtrjnq jpecxuty GRP_0 call for ecwtrjnq jpecxuty call for ecwtrjnq j...
3107 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
3111 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
3128 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
3129 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
3171 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password...
3172 password reset password reset GRP_0 password reset password reset
3219 call for ecwtrjnq jpecxuty call for ecwtrjnq jpecxuty GRP_0 call for ecwtrjnq jpecxuty call for ecwtrjnq j...
3221 password reset password reset GRP_0 password reset password reset
3225 windows password reset windows password reset GRP_0 windows password reset windows password reset
3299 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
3375 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
3396 account locked account locked GRP_0 account locked account locked
3398 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
3399 account locked. account locked. GRP_0 account locked. account locked.
3400 account locked. account locked. GRP_0 account locked. account locked.
3402 password reset password reset GRP_0 password reset password reset
3409 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
3411 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
3505 windows account locked windows account locked GRP_0 windows account locked windows account locked
3513 password reset password reset GRP_0 password reset password reset
3528 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
3550 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
3584 outlook hangs. outlook hangs. GRP_0 outlook hangs. outlook hangs.
3618 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
3619 call came and got disconnected call came and got disconnected GRP_0 call came and got disconnected call came and g...
3621 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
3636 blank call blank call GRP_0 blank call blank call
3637 blank call blank call GRP_0 blank call blank call
3639 windows password reset windows password reset GRP_0 windows password reset windows password reset
3641 password reset password reset GRP_0 password reset password reset
3647 答复: 答复: order products online problem \r\n\r\nreceived from: fkdazsmi.yecbrofv@gmail... GRP_0 答复: 答复: order products online problem ...
3659 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
3661 unable to load outlook unable to load outlook GRP_0 unable to load outlook unable to load outlook
3671 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
3672 account locked. account locked. GRP_0 account locked. account locked.
3674 password reset password reset GRP_0 password reset password reset
3688 account locked in erp SID_34 account locked in erp SID_34 GRP_0 account locked in erp SID_34 account locked in...
3693 reset passwords for mvhcoqed konjdmwq using pa... the GRP_17 reset passwords for mvhcoqed konjdmwq using pa...
3697 ess password reset ess password reset GRP_0 ess password reset ess password reset
3724 windows account locked windows account locked GRP_0 windows account locked windows account locked
3725 windows account locked windows account locked GRP_0 windows account locked windows account locked
3731 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
3772 account unlock account unlock GRP_0 account unlock account unlock
3775 unable to access collaboration_platform unable to access collaboration_platform GRP_0 unable to access collaboration_platform unable...
3779 unable to access collaboration_platform unable to access collaboration_platform GRP_0 unable to access collaboration_platform unable...
3800 account locked account locked GRP_0 account locked account locked
3822 erp SID_34 password reset request erp SID_34 password reset request GRP_0 erp SID_34 password reset request erp SID_34 ...
3846 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password...
3863 windows account locked windows account locked GRP_0 windows account locked windows account locked
3865 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
3887 unable to login to engineering tool unable to login to engineering tool GRP_0 unable to login to engineering tool unable to...
3891 windows account locked windows account locked GRP_0 windows account locked windows account locked
3892 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
3905 vpn not working- vpn.company.com link is givi... vpn not working- vpn.company.com link is givi... GRP_0 vpn not working- vpn.company.com link is givi...
3908 vpn not working- vpn.company.com link is givi... vpn not working- vpn.company.com link is givi... GRP_0 vpn not working- vpn.company.com link is givi...
3909 vpn not working- vpn.company.com link is givi... vpn not working- vpn.company.com link is givi... GRP_0 vpn not working- vpn.company.com link is givi...
3910 -user unable tologin to vpn.\r\n-connected to... GRP_0 -user unable tologin to vpn.\r\n-connected ...
3913 vpn not working- vpn.company.com link is givi... vpn not working- vpn.company.com link is givi... GRP_0 vpn not working- vpn.company.com link is givi...
3914 vpn not working- vpn.company.com link is givi... vpn not working- vpn.company.com link is givi... GRP_0 vpn not working- vpn.company.com link is givi...
3915 -user unable tologin to vpn.\r\n-connected to... GRP_0 -user unable tologin to vpn.\r\n-connected ...
3918 vpn not working- vpn.company.com link is givi... vpn not working- vpn.company.com link is givi... GRP_0 vpn not working- vpn.company.com link is givi...
3921 -user unable tologin to vpn.\r\n-connected to... GRP_0 -user unable tologin to vpn.\r\n-connected ...
3925 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
3964 windows password reset windows password reset GRP_0 windows password reset windows password reset
3975 mobile device activation mobile device activation GRP_0 mobile device activation mobile device activation
3979 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
3996 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
4036 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
4037 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
4038 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
4047 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
4060 unable to login to system unable to login to system GRP_0 unable to login to system unable to login to s...
4094 job Job_2883 failed in job_scheduler at: 09/18... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_2883 failed in job_scheduler at: 09/18...
4097 windows account locked windows account locked GRP_0 windows account locked windows account locked
4115 mobile device activation mobile device activation GRP_0 mobile device activation mobile device activa...
4118 password reset alert from o365 password reset alert from o365 GRP_0 password reset alert from o365 password reset ...
4133 unable to login to engineering tool unable to login to engineering tool GRP_0 unable to login to engineering tool unable to...
4137 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
4176 unable to install engineering_tool unable to install engineering_tool GRP_0 unable to install engineering_tool unable to i...
4177 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
4180 unlocked account unlocked account GRP_0 unlocked account unlocked account
4185 unable to connect to wifi unable to connect to wifi GRP_0 unable to connect to wifi unable to connect to...
4196 windows password reset windows password reset GRP_0 windows password reset windows password reset
4229 not able to access -inq industrial (-inq.indus... \r\n\r\nreceived from: muqdlobv.qflsdahg@gmail... GRP_0 not able to access -inq industrial (-inq.indus...
4251 password reset request password reset request GRP_0 password reset request password reset request
4254 password reset password reset GRP_0 password reset password reset
4257 password reset alert from o365 password reset alert from o365 GRP_0 password reset alert from o365 password reset ...
4273 blank call blank call GRP_0 blank call blank call
4292 outlook not launching outlook not launching GRP_0 outlook not launching outlook not launching
4303 call for ecwtrjnq jpecxuty call for ecwtrjnq jpecxuty GRP_0 call for ecwtrjnq jpecxuty call for ecwtrjnq j...
4327 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
4336 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
4361 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
4374 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
4377 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
4381 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
4387 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
4399 password reset from password_management_tool password reset from password_management_tool GRP_0 password reset from password_management_tool p...
4400 password reset password reset GRP_0 password reset password reset
4403 erp SID_39 password reset erp SID_39 password reset GRP_0 erp SID_39 password reset erp SID_39 password ...
4433 account unlock account unlock GRP_0 account unlock account unlock
4440 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
4495 job SID_37hoti failed in job_scheduler at: 09/... received from: monitoring_tool@company.com\r\n... GRP_5 job SID_37hoti failed in job_scheduler at: 09/...
4513 mobile device activation mobile device activation GRP_0 mobile device activation mobile device activa...
4530 blank call blank call GRP_0 blank call blank call
4536 unable to log in to windows unable to log in to windows GRP_0 unable to log in to windows unable to log in ...
4544 outlook does not start. outlook does not start. GRP_0 outlook does not start. outlook does not start.
4550 call disconnected due to vpn disconnection call disconnected due to vpn disconnection GRP_0 call disconnected due to vpn disconnection cal...
4555 password expired password expired GRP_0 password expired password expired
4567 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
4613 windows password reset windows password reset GRP_0 windows password reset windows password reset
4630 password reset password reset GRP_0 password reset password reset
4634 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
4646 unable to connect to vpn unable to connect to vpn GRP_19 unable to connect to vpn unable to connect to vpn
4647 mobile device activation mobile device activation GRP_0 mobile device activation mobile device activa...
4655 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
4658 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
4673 outlook not launching outlook not launching GRP_0 outlook not launching outlook not launching
4675 windows password reset windows password reset GRP_0 windows password reset windows password reset
4687 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
4688 account locked account locked GRP_0 account locked account locked
4691 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
4704 private address fields are enabled on employee... disable private address fields, new & edit but... GRP_15 private address fields are enabled on employee...
4718 account locked. account locked. GRP_0 account locked. account locked.
4743 windows account locked windows account locked GRP_0 windows account locked windows account locked
4745 windows account locked windows account locked GRP_0 windows account locked windows account locked
4750 unable to login to erp SID_34 unable to login to erp SID_34 GRP_0 unable to login to erp SID_34 unable to login...
4766 unable to connect to outlook unable to connect to outlook GRP_0 unable to connect to outlook unable to connec...
4787 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
4796 account locked out account locked out GRP_0 account locked out account locked out
4839 unable to login to SID_1 unable to login to SID_1 GRP_0 unable to login to SID_1 unable to login to SID_1
4849 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
4856 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
4865 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
4881 install company barcode für ewew8323504 \vzqo... install company barcode für ewew8323504 \vzqo... GRP_24 install company barcode für ewew8323504 \vzqo...
4921 unable to log in to engineering tool unable to log in to engineering tool GRP_0 unable to log in to engineering tool unable to...
4928 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
4951 german call german call GRP_0 german call german call
4952 password reset password reset GRP_0 password reset password reset
4956 password reset password reset GRP_0 password reset password reset
4967 account locked out account locked out GRP_0 account locked out account locked out
4984 reset passwords for cubdsrml znewqgop using pa... the GRP_17 reset passwords for cubdsrml znewqgop using pa...
4991 reset passwords for davidthd robankm using pas... the GRP_17 reset passwords for davidthd robankm using pas...
5005 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
5026 account locked out account locked out GRP_0 account locked out account locked out
5029 account locked out account locked out GRP_0 account locked out account locked out
5031 ie issue ie issue GRP_0 ie issue ie issue
5034 unable to login to collaboration_platform unable to login to collaboration_platform GRP_0 unable to login to collaboration_platform unab...
5047 windows password reset windows password reset GRP_0 windows password reset windows password reset
5048 account unlock account unlock GRP_0 account unlock account unlock
5049 password reset password reset GRP_0 password reset password reset
5050 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
5060 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
5063 windows password reset windows password reset GRP_0 windows password reset windows password reset
5065 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
5083 windows password reset windows password reset GRP_0 windows password reset windows password reset
5097 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
5133 windows account locked windows account locked GRP_0 windows account locked windows account locked
5161 unable to log in to erp SID_34 unable to log in to erp SID_34 GRP_0 unable to log in to erp SID_34 unable to log i...
5164 unable to log in to supply_chain_software unable to log in to supply_chain_software GRP_0 unable to log in to supply_chain_software unab...
5181 account unlock account unlock GRP_0 account unlock account unlock
5193 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
5195 windows password reset windows password reset GRP_0 windows password reset windows password reset
5201 unable to load outlook unable to load outlook GRP_0 unable to load outlook unable to load outlook
5212 blank call blank call GRP_0 blank call blank call
5214 unable to log in to erp SID_34 unable to log in to erp SID_34 GRP_0 unable to log in to erp SID_34 unable to log ...
5221 account unlock account unlock GRP_0 account unlock account unlock
5222 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
5225 account locked. account locked. GRP_0 account locked. account locked.
5226 blank call blank call GRP_0 blank call blank call
5227 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
5243 windows account locked windows account locked GRP_0 windows account locked windows account locked
5261 account locked out and password reset account locked out and password reset GRP_0 account locked out and password reset account...
5271 windows account locked windows account locked GRP_0 windows account locked windows account locked
5273 windows account locked windows account locked GRP_0 windows account locked windows account locked
5274 account locked out and password reset account locked out and password reset GRP_0 account locked out and password reset account...
5312 erp SID_34 account locked. erp SID_34 account locked. GRP_0 erp SID_34 account locked. erp SID_34 account ...
5314 account locked. account locked. GRP_0 account locked. account locked.
5317 reset passwords for bxeagsmt zrwdgsco using pa... the GRP_17 reset passwords for bxeagsmt zrwdgsco using pa...
5391 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
5393 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
5400 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
5427 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
5442 skype audio not working skype audio not working GRP_0 skype audio not working skype audio not working
5446 ad account locked out ad account locked out GRP_0 ad account locked out ad account locked out
5486 account locked. account locked. GRP_0 account locked. account locked.
5488 job SID_38hotf failed in job_scheduler at: 09/... received from: monitoring_tool@company.com\r\n... GRP_8 job SID_38hotf failed in job_scheduler at: 09/...
5502 mapping printers mapping printers GRP_0 mapping printers mapping printers
5511 ess password reset ess password reset GRP_0 ess password reset ess password reset
5521 blank call //gso blank call //gso GRP_0 blank call //gso blank call //gso
5526 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
5534 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
5545 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
5546 wifi not working wifi not working GRP_0 wifi not working wifi not working
5549 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password...
5577 password reset request password reset request GRP_0 password reset request password reset request
5590 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
5616 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
5620 need to configure printers need to configure printers GRP_0 need to configure printers need to configure p...
5646 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
5648 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
5676 skype : personal certificate issue skype : personal certificate issue GRP_0 skype : personal certificate issue skype : pe...
5685 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
5690 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
5694 mobile device activation mobile device activation GRP_0 mobile device activation mobile device activa...
5708 reset passwords for bxeagsmt zrwdgsco using pa... the GRP_17 reset passwords for bxeagsmt zrwdgsco using pa...
5715 windows account locked windows account locked GRP_0 windows account locked windows account locked
5741 windows account locked windows account locked GRP_0 windows account locked windows account locked
5743 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
5746 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
5775 account unlock account unlock GRP_0 account unlock account unlock
5782 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
5812 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
5817 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
5818 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
5824 erp SID_34 password reset done. confirmed with... erp SID_34 password reset done. confirmed with... GRP_0 erp SID_34 password reset done. confirmed with...
5844 printer driver update printer driver update GRP_0 printer driver update printer driver update
5859 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
5862 password reset request. password reset request. GRP_0 password reset request. password reset request.
5868 ad account locked out ad account locked out GRP_0 ad account locked out ad account locked out
5884 reset passwords for bxeagsmt zrwdgsco using pa... the GRP_17 reset passwords for bxeagsmt zrwdgsco using pa...
5886 windows account locked windows account locked GRP_0 windows account locked windows account locked
5900 account locked in erp SID_34 account locked in erp SID_34 GRP_0 account locked in erp SID_34 account locked in...
5906 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
5928 ticket update on inplant_855239 ticket update on inplant_855239 GRP_0 ticket update on inplant_855239 ticket update ...
5929 unable to connect to vpn unable to connect to vpn GRP_0 unable to connect to vpn unable to connect to vpn
5941 outlook not working outlook not working GRP_0 outlook not working outlook not working
5945 blank call //gso blank call //gso GRP_0 blank call //gso blank call //gso
5967 unable to log in to mii unable to log in to mii GRP_0 unable to log in to mii unable to log in to mii
5970 account unlock account unlock GRP_0 account unlock account unlock
5984 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
5989 account unlock account unlock GRP_0 account unlock account unlock
5992 unable to login to erp SID_34 unable to login to erp SID_34 GRP_0 unable to login to erp SID_34 unable to login ...
5995 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
5997 outlook hangs. outlook hangs. GRP_0 outlook hangs. outlook hangs.
6024 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
6035 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
6058 reset passwords for bxeagsmt zrwdgsco using pa... the GRP_17 reset passwords for bxeagsmt zrwdgsco using pa...
6070 account locked out and password reset account locked out and password reset GRP_0 account locked out and password reset account...
6085 account locked out and password reset account locked out and password reset GRP_0 account locked out and password reset account...
6095 windows account locked windows account locked GRP_0 windows account locked windows account locked
6130 job Job_749 failed in job_scheduler at: 08/27/... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_749 failed in job_scheduler at: 08/27/...
6141 job Job_1989 failed in job_scheduler at: 08/27... received from: monitoring_tool@company.com\r\n... GRP_6 job Job_1989 failed in job_scheduler at: 08/27...
6171 unable to submit expense report unable to submit expense report GRP_0 unable to submit expense report unable to subm...
6183 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
6201 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
6204 password expired password expired GRP_0 password expired password expired
6207 unable to change password on password_manageme... unable to change password on password_manageme... GRP_0 unable to change password on password_manageme...
6252 job Job_3028 failed in job_scheduler at: 08/26... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_3028 failed in job_scheduler at: 08/26...
6260 job Job_3028 failed in job_scheduler at: 08/25... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_3028 failed in job_scheduler at: 08/25...
6265 job pp_EU_tool_netch_ap1 failed in job_schedul... received from: monitoring_tool@company.com\r\n... GRP_8 job pp_EU_tool_netch_ap1 failed in job_schedul...
6270 unable to log in to erp SID_34 unable to log in to erp SID_34 GRP_0 unable to log in to erp SID_34 unable to log i...
6271 unable to log in to erp SID_34 unable to log in to erp SID_34 GRP_0 unable to log in to erp SID_34 unable to log i...
6299 account locked account locked GRP_0 account locked account locked
6313 password reset request. password reset request. GRP_0 password reset request. password reset request.
6321 job Job_1314 failed in job_scheduler at: 08/25... received from: monitoring_tool@company.com\r\n... GRP_60 job Job_1314 failed in job_scheduler at: 08/25...
6323 job Job_1314 failed in job_scheduler at: 08/25... received from: monitoring_tool@company.com\r\n... GRP_60 job Job_1314 failed in job_scheduler at: 08/25...
6337 cisco access point is not working. cisco access point is not working.\r\nmac addr... GRP_4 cisco access point is not working. cisco acces...
6340 probleme mit erpgui \vsdtxwry ngkcdjye probleme mit erpgui \vsdtxwry ngkcdjye GRP_24 probleme mit erpgui \vsdtxwry ngkcdjye problem...
6351 password reset alert from o365 password reset alert from o365 GRP_0 password reset alert from o365 password reset ...
6363 password reset alert from o365 password reset alert from o365 GRP_0 password reset alert from o365 password reset ...
6382 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
6392 ess password reset ess password reset GRP_0 ess password reset ess password reset
6411 svc-now ticket found... doing nothing received from: monitoring_tool@company.com\r\n... GRP_60 svc-now ticket found... doing nothing receiv...
6412 svc-now ticket found... doing nothing received from: monitoring_tool@company.com\r\n... GRP_60 svc-now ticket found... doing nothing receiv...
6437 outlook is continuously asking for password. outlook is continuously asking for password. GRP_0 outlook is continuously asking for password. o...
6440 account unlock account unlock GRP_0 account unlock account unlock
6449 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
6471 job SID_41arc2 failed in job_scheduler at: 08/... received from: monitoring_tool@company.com\r\n... GRP_8 job SID_41arc2 failed in job_scheduler at: 08/...
6485 job SID_31arc2 failed in job_scheduler at: 08/... received from: monitoring_tool@company.com\r\n... GRP_8 job SID_31arc2 failed in job_scheduler at: 08/...
6521 job Job_3028 failed in job_scheduler at: 08/24... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_3028 failed in job_scheduler at: 08/24...
6522 job Job_3028 failed in job_scheduler at: 08/24... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_3028 failed in job_scheduler at: 08/24...
6523 job Job_3028 failed in job_scheduler at: 08/24... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_3028 failed in job_scheduler at: 08/24...
6524 job Job_3028 failed in job_scheduler at: 08/24... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_3028 failed in job_scheduler at: 08/24...
6532 unable to login to erp SID_34 unable to login to erp SID_34 GRP_0 unable to login to erp SID_34 unable to login ...
6551 unable to load outlook unable to load outlook GRP_0 unable to load outlook unable to load outlook
6553 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
6563 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
6581 frequent account lockout frequent account lockout GRP_0 frequent account lockout frequent account lockout
6592 unable to log in to skype unable to log in to skype GRP_0 unable to log in to skype unable to log in to ...
6594 outlook not responding outlook not responding GRP_0 outlook not responding outlook not responding
6603 account unlock account unlock GRP_0 account unlock account unlock
6610 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
6625 account locked. account locked. GRP_0 account locked. account locked.
6639 erp SID_34 account locked. erp SID_34 account locked. GRP_0 erp SID_34 account locked. erp SID_34 account ...
6652 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
6655 windows account locked windows account locked GRP_0 windows account locked windows account locked
6659 job Job_3028 failed in job_scheduler at: 08/23... received from: monitoring_tool@company.com\r\n... GRP_8 job Job_3028 failed in job_scheduler at: 08/23...
6661 windows account locked windows account locked GRP_0 windows account locked windows account locked
6662 unable to login to engineering tool unable to login to engineering tool GRP_0 unable to login to engineering tool unable to ...
6663 unable to open outlook unable to open outlook GRP_0 unable to open outlook unable to open outlook
6664 unable to login to engineering tool unable to login to engineering tool GRP_0 unable to login to engineering tool unable to...
6679 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
6711 unable to login to erp SID_34 unable to login to erp SID_34 GRP_0 unable to login to erp SID_34 unable to login ...
6717 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
6719 unable to log in to erp SID_34 unable to log in to erp SID_34 GRP_0 unable to log in to erp SID_34 unable to log i...
6724 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
6729 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
6736 ess password reset ess password reset GRP_0 ess password reset ess password reset
6738 account locked out account locked out GRP_0 account locked out account locked out
6739 blank call // gso blank call // gso GRP_0 blank call // gso blank call // gso
6740 unable to connect to wifi unable to connect to wifi GRP_0 unable to connect to wifi unable to connect to...
6771 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
6778 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
6801 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
6819 reset passwords for wvdxnkhf jirecvta using pa... the GRP_17 reset passwords for wvdxnkhf jirecvta using pa...
6841 windows account locked windows account locked GRP_0 windows account locked windows account locked
6850 windows account locked windows account locked GRP_0 windows account locked windows account locked
6856 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
6921 password reset password reset GRP_0 password reset password reset
6942 call came and got disconnected call came and got disconnected GRP_0 call came and got disconnected call came and g...
6950 erp SID_34 account locked. erp SID_34 account locked. GRP_0 erp SID_34 account locked. erp SID_34 account ...
6958 outlook not responding outlook not responding GRP_0 outlook not responding outlook not responding
6969 password reset request. password reset request. GRP_0 password reset request. password reset request.
6973 password reset alert from o365 password reset alert from o365 GRP_0 password reset alert from o365 password reset ...
6992 probleme mit erpgui \tmqfjard qzhgdoua probleme mit erpgui \tmqfjard qzhgdoua GRP_24 probleme mit erpgui \tmqfjard qzhgdoua problem...
7034 blank call blank call GRP_0 blank call blank call
7035 printer driver update printer driver update GRP_0 printer driver update printer driver update
7046 unable to login to collaboration_platform unable to login to collaboration_platform GRP_0 unable to login to collaboration_platform unab...
7069 windows password reset windows password reset GRP_0 windows password reset windows password reset
7075 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
7089 skype does not open. skype does not open. GRP_0 skype does not open. skype does not open.
7132 reset passwords for ezrsdgfc hofgvwel using pa... the GRP_17 reset passwords for ezrsdgfc hofgvwel using pa...
7136 account locked in erp SID_34 account locked in erp SID_34 GRP_0 account locked in erp SID_34 account locked in...
7138 unable to sign in to skype unable to sign in to skype GRP_0 unable to sign in to skype unable to sign in t...
7153 erp SID_34 password locked erp SID_34 password locked GRP_0 erp SID_34 password locked erp SID_34 password...
7155 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
7168 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
7170 account is locked account is locked GRP_0 account is locked account is locked
7201 erp SID_34 account unlock erp SID_34 account unlock GRP_0 erp SID_34 account unlock erp SID_34 account u...
7209 email queries email queries GRP_0 email queries email queries
7215 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
7222 account locked. account locked. GRP_0 account locked. account locked.
7226 erp is not working. error : log on balancing e... inc1542327 : cert opened. \r\nwork around GRP_0 erp is not working. error : log on balancing e...
7227 erp is not working. error : log on balancing e... inc1542327 : cert opened. \r\nwork around GRP_0 erp is not working. error : log on balancing e...
7276 mobile device activation. mobile device activation. GRP_0 mobile device activation. mobile device activa...
7323 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
7329 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
7344 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
7370 password reset password reset GRP_0 password reset password reset
7378 password reset password reset GRP_0 password reset password reset
7392 erp SID_37 password reset erp SID_37 password reset GRP_0 erp SID_37 password reset erp SID_37 password ...
7407 password reset password reset GRP_0 password reset password reset
7427 ad account lock out ad account lock out GRP_0 ad account lock out ad account lock out
7457 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
7458 windows account locked windows account locked GRP_0 windows account locked windows account locked
7459 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
7460 unable to login to erp SID_34 unable to login to erp SID_34 GRP_0 unable to login to erp SID_34 unable to login ...
7466 windows account locked windows account locked GRP_0 windows account locked windows account locked
7471 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
7472 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
7473 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
7477 help to change the windows password using pass... help to change the windows password using pass... GRP_0 help to change the windows password using pass...
7501 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
7518 windows password reset windows password reset GRP_0 windows password reset windows password reset
7521 unable to launch skype unable to launch skype GRP_0 unable to launch skype unable to launch skype
7523 unable to connect to company secure unable to connect to company secure GRP_0 unable to connect to company secure unable to ...
7528 unable to launch netweaver unable to launch netweaver GRP_0 unable to launch netweaver unable to launch ne...
7547 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
7636 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
7670 unable to login to collaboration_platform unable to login to collaboration_platform GRP_0 unable to login to collaboration_platform unab...
7686 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
7687 erp SID_34 account unlock and password reset erp SID_34 account unlock and password reset GRP_0 erp SID_34 account unlock and password reset e...
7690 erp SID_34 locked out. erp SID_34 locked out. GRP_0 erp SID_34 locked out. erp SID_34 locked out.
7692 unable to load outlook unable to load outlook GRP_0 unable to load outlook unable to load outlook
7694 account locked account locked GRP_0 account locked account locked
7735 account locked in ad account locked in ad GRP_0 account locked in ad account locked in ad
7737 not able to login to skype not able to login to skype GRP_0 not able to login to skype not able to login t...
7756 german call german call GRP_0 german call german call
7769 password reset password reset GRP_0 password reset password reset
7772 blank call // loud noise blank call // loud noise GRP_0 blank call // loud noise blank call // loud noise
7785 account locked out account locked out GRP_0 account locked out account locked out
7798 wifi not working wifi not working GRP_0 wifi not working wifi not working
7836 probleme mit erpgui \tmqfjard qzhgdoua probleme mit erpgui \tmqfjard qzhgdoua GRP_24 probleme mit erpgui \tmqfjard qzhgdoua problem...
7862 mobile device activation mobile device activation GRP_0 mobile device activation mobile device activation
7863 unable to install engineering_tool unable to install engineering_tool GRP_0 unable to install engineering_tool unable to i...
7872 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
7876 error login on to the SID_34 system. error login on to the SID_34 system.\r\n-verif... GRP_0 error login on to the SID_34 system. error log...
7880 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
7888 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
7890 windows password reset windows password reset GRP_0 windows password reset windows password reset
7894 password reset password reset GRP_0 password reset password reset
7905 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
7908 outlook is not opening. outlook is not opening. GRP_0 outlook is not opening. outlook is not opening.
7909 account locked. account locked. GRP_0 account locked. account locked.
7949 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account ...
7985 ess password reset ess password reset GRP_0 ess password reset ess password reset
8006 password reset password reset GRP_0 password reset password reset
8017 password reset password reset GRP_0 password reset password reset
8019 account locked out account locked out GRP_0 account locked out account locked out
8025 erp SID_34 account locked erp SID_34 account locked GRP_0 erp SID_34 account locked erp SID_34 account l...
8028 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
8031 account lockout account lockout GRP_0 account lockout account lockout
8051 issue on pricing in distributor_tool we have agreed price with many of the distribu... GRP_21 issue on pricing in distributor_tool we have a...
8052 outlook issue outlook issue GRP_0 outlook issue outlook issue
8054 erp SID_34 password reset. erp SID_34 password reset. GRP_0 erp SID_34 password reset. erp SID_34 password...
8077 erp SID_34 password reset request. erp SID_34 password reset request. GRP_0 erp SID_34 password reset request. erp SID_34 ...
8082 erp SID_34 password reset request erp SID_34 password reset request GRP_0 erp SID_34 password reset request erp SID_34 ...
8093 reset passwords for prgthyuulla ramdntythanjes... the GRP_17 reset passwords for prgthyuulla ramdntythanjes...
8102 unable to send or receive email unable to send or receive email GRP_0 unable to send or receive email unable to sen...
8109 windows account locked windows account locked GRP_0 windows account locked windows account locked
8117 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to s...
8132 windows password reset windows password reset GRP_0 windows password reset windows password reset
8150 windows password reset windows password reset GRP_0 windows password reset windows password reset
8175 password reset password reset GRP_0 password reset password reset
8184 erp SID_34 password reset erp SID_34 password reset GRP_0 erp SID_34 password reset erp SID_34 password ...
8187 unable to login to skype unable to login to skype GRP_0 unable to login to skype unable to login to skype
8215 password reset request. password reset request. GRP_0 password reset request. password reset request.
8259 account locked in erp SID_34 account locked in erp SID_34 GRP_0 account locked in erp SID_34 account locked in...
8265 unable to login to engineering tool unable to login to engineering tool GRP_0 unable to login to engineering tool unable to...
8267 windows account locked windows account locked GRP_0 windows account locked windows account locked
8268 windows account locked windows account locked GRP_0 windows account locked windows account locked
8272 login issue login issue\r\n-verified user details.(employe... GRP_0 login issue login issue\r\n-verified user deta...
8328 need password reset need password reset GRP_0 need password reset need password reset
8337 unable to connect to wireless unable to connect to wireless GRP_0 unable to connect to wireless unable to connec...
8347 blank call // loud noise blank call // loud noise GRP_0 blank call // loud noise blank call // loud noise
8363 unable to login to collaboration_platform unable to login to collaboration_platform GRP_0 unable to login to collaboration_platform unab...
8367 account locked account locked GRP_0 account locked account locked
8405 unable to launch outlook unable to launch outlook GRP_0 unable to launch outlook unable to launch outlook
8424 windows account lockout windows account lockout GRP_0 windows account lockout windows account lockout
8450 unable to connect to wifi unable to connect to wifi GRP_0 unable to connect to wifi unable to connect to...
8451 password reset erp SID_34 password reset erp SID_34 GRP_0 password reset erp SID_34 password reset erp S...
8458 windows account locked windows account locked GRP_0 windows account locked windows account locked
8489 account locked account locked GRP_0 account locked account locked
In [18]:
print("Total Number of duplicate records are:",duplicateRowsDF['Description'].count() )
Total Number of duplicate records are: 591
In [19]:
# Removing Duplicate records 

df_updt =df1.drop_duplicates(['Short description', 'Description', 'Assignment group','New Description'])

#Displaying the shape of the dataframe after removing the duplicate records

df_updt.shape
Out[19]:
(7909, 4)
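The new row count agrees with the duplicate count printed earlier: 8500 - 591 = 7909. Below is a minimal sketch on a made-up frame (the `toy` data and the two-column subset are assumptions for illustration) of how `duplicated` and `drop_duplicates` relate when the default `keep='first'` is used:

```python
import pandas as pd

# Toy data (hypothetical), standing in for the ticket dataframe
toy = pd.DataFrame({
    "Short description": ["password reset", "password reset", "outlook issue", "password reset"],
    "Assignment group": ["GRP_0", "GRP_0", "GRP_0", "GRP_0"],
})
cols = ["Short description", "Assignment group"]

# Rows flagged as repeats of an earlier row (the first occurrence is not flagged)
n_dupes = int(toy.duplicated(subset=cols, keep="first").sum())

# Dropping duplicates keeps exactly one row per distinct combination
deduped = toy.drop_duplicates(subset=cols)

print(n_dupes, len(toy) - len(deduped))
```

The number of flagged rows equals the number of rows removed, which is the check applied to the real data above.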
In [20]:
# Checking number of unique groups in the dataframe

df_updt['Assignment group'].unique()
Out[20]:
array(['GRP_0', 'GRP_1', 'GRP_3', 'GRP_4', 'GRP_5', 'GRP_6', 'GRP_7',
       'GRP_8', 'GRP_9', 'GRP_10', 'GRP_11', 'GRP_12', 'GRP_13', 'GRP_14',
       'GRP_15', 'GRP_16', 'GRP_17', 'GRP_18', 'GRP_19', 'GRP_2',
       'GRP_20', 'GRP_21', 'GRP_22', 'GRP_23', 'GRP_24', 'GRP_25',
       'GRP_26', 'GRP_27', 'GRP_28', 'GRP_29', 'GRP_30', 'GRP_31',
       'GRP_33', 'GRP_34', 'GRP_35', 'GRP_36', 'GRP_37', 'GRP_38',
       'GRP_39', 'GRP_40', 'GRP_41', 'GRP_42', 'GRP_43', 'GRP_44',
       'GRP_45', 'GRP_46', 'GRP_47', 'GRP_48', 'GRP_49', 'GRP_50',
       'GRP_51', 'GRP_52', 'GRP_53', 'GRP_54', 'GRP_55', 'GRP_56',
       'GRP_57', 'GRP_58', 'GRP_59', 'GRP_60', 'GRP_61', 'GRP_32',
       'GRP_62', 'GRP_63', 'GRP_64', 'GRP_65', 'GRP_66', 'GRP_67',
       'GRP_68', 'GRP_69', 'GRP_70', 'GRP_71', 'GRP_72', 'GRP_73'],
      dtype=object)
In [21]:
#Displaying the basic description of Assignment group column

df_updt["Assignment group"].describe()
Out[21]:
count      7909
unique       74
top       GRP_0
freq       3429
Name: Assignment group, dtype: object

From this we can observe that there are a total of 74 unique target groups in our dataframe.

In [22]:
# Displaying the total number of tickets falling under each Assignment group with their respective percentage 

df2 = df_updt['Assignment group'].value_counts().reset_index()
df2['Percentage'] = (df2['Assignment group']/df2['Assignment group'].sum())*100
df2.head(10)
Out[22]:
index Assignment group Percentage
0 GRP_0 3429 43.355671
1 GRP_8 645 8.155266
2 GRP_24 285 3.603490
3 GRP_12 256 3.236819
4 GRP_9 252 3.186244
5 GRP_2 241 3.047161
6 GRP_19 214 2.705778
7 GRP_3 200 2.528765
8 GRP_6 183 2.313820
9 GRP_13 145 1.833354
In [23]:
# Plotting the number of tickets in each Assignment group, annotated with its percentage share

import seaborn as sns
import warnings

sns.set(style="whitegrid")
plt.figure(figsize=(20,5))
ax = sns.countplot(x="Assignment group", data=df_updt, order=df_updt["Assignment group"].value_counts().index)
ax.set_xticklabels(ax.get_xticklabels(), rotation=90)
for p in ax.patches:
    pct = p.get_height() / len(df_updt.index) * 100
    ax.annotate(f"{pct:.2f}%", (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='bottom', rotation=90, xytext=(0, 10), textcoords='offset points')

plt.tight_layout()
plt.show()

From this we can observe that the distribution of the Assignment group attribute is highly right-skewed. More than 40% of the data (3429 of 7909 records, about 43.4%) falls under GRP_0, so the dataframe is extremely imbalanced.
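The share of the largest group can be reproduced directly from the counts. A minimal sketch using only the standard library, with the three largest counts and the row total copied from the outputs above (the variable names are illustrative):

```python
# Ticket counts copied from the value_counts() output above (top 3 groups only)
counts = {"GRP_0": 3429, "GRP_8": 645, "GRP_24": 285}
total = 7909  # rows remaining after de-duplication

# Percentage share of each group
shares = {grp: 100 * n / total for grp, n in counts.items()}
for grp, pct in shares.items():
    print(f"{grp}: {pct:.2f}%")

# How many times larger the biggest group is than the runner-up
ratio_to_second = counts["GRP_0"] / counts["GRP_8"]
print(f"GRP_0 is about {ratio_to_second:.1f}x the size of GRP_8")
```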

In [24]:
# Displaying the top 20 Assignment groups with the highest number of tickets, in descending order

df_top_20 = df_updt['Assignment group'].value_counts().nlargest(20).reset_index()
df_top_20
Out[24]:
index Assignment group
0 GRP_0 3429
1 GRP_8 645
2 GRP_24 285
3 GRP_12 256
4 GRP_9 252
5 GRP_2 241
6 GRP_19 214
7 GRP_3 200
8 GRP_6 183
9 GRP_13 145
10 GRP_10 140
11 GRP_5 128
12 GRP_14 118
13 GRP_25 116
14 GRP_33 107
15 GRP_4 99
16 GRP_29 97
17 GRP_18 88
18 GRP_16 85
19 GRP_31 69
In [25]:
# Visualizing the top 20 assignment groups 

plt.figure(figsize=(12,6))
bars = plt.bar(df_top_20['index'],df_top_20['Assignment group'])
plt.title('Top 20 Assignment groups with highest number of tickets falling under them')
plt.xlabel('Assignment Group')
plt.xticks(rotation=90)
plt.ylabel('Number of Tickets')

for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x(), yval + .005, yval)
plt.tight_layout()
plt.show()
In [26]:
# Displaying the bottom 20 Assignment groups with the lowest number of tickets, in ascending order

df_bottom_20 = df_updt['Assignment group'].value_counts().nsmallest(20).reset_index()
df_bottom_20
Out[26]:
index Assignment group
0 GRP_64 1
1 GRP_70 1
2 GRP_61 1
3 GRP_73 1
4 GRP_67 1
5 GRP_35 1
6 GRP_54 2
7 GRP_72 2
8 GRP_69 2
9 GRP_57 2
10 GRP_71 2
11 GRP_56 3
12 GRP_58 3
13 GRP_63 3
14 GRP_38 3
15 GRP_68 3
16 GRP_32 4
17 GRP_66 4
18 GRP_43 5
19 GRP_59 6
In [27]:
# Visualizing the bottom 20 assignment groups  

plt.figure(figsize=(12,6))
bars = plt.bar(df_bottom_20['index'],df_bottom_20['Assignment group'])
plt.title('Bottom 20 Assignment groups with minimum number of tickets falling under them')
plt.xlabel('Assignment Group')
plt.xticks(rotation=90)
plt.ylabel('Number of Tickets')
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x(), yval + .005, yval)
plt.tight_layout()
plt.show()
In [28]:
# Displaying a record before cleaning the garbled data

df_updt["New Description"][186]
Out[28]:
'é\x9d’岛兴å\x90ˆæœºç”µshipment notification邮箱设置 from:  \nsent: friday, october 28, 2016 7:20 am\nto: nwfodmhc exurcwkm\nsubject: re: é\x9d’岛兴å\x90ˆæœºç”µshipment notification邮箱设置\n\ndear,\npls help to update customer 4563729890 shipment notification email address :  abcdegy@gmail.com \n\n\nb. '
In [29]:
# Displaying another record before cleaning the garbled data 

df_updt["New Description"][281] 
Out[29]:
'unable to down load ethics module  from: brdhdd dhwduw\nsent: thursday, october 27, 2016 6:12 am\nto: nwfodmhc exurcwkm\nsubject::fwd: unable to down load ethics module \n\n\nbegin forwarded message:\nfrom:  <dqwhpjxy.pozjxbey@gmail.com>\nto:  <zanivrec.capbfhur@gmail.com>\nsubject: unable to down load ethics module \nhi  – trust doing well . i am unable to down load & getting below msg. i did reset resolution however still same issue persist.\n \nplease help.\n \n \n \n \n \ndirector of sales \ncompany indirect channels  - asia \n& \ndqwhpjxy.pozjxbey@gmail.com \n\n \n \n \n \n\n\n'
In [30]:
! pip install ftfy
Requirement already satisfied: ftfy in c:\users\nishant\anaconda3\lib\site-packages (5.8)
Requirement already satisfied: wcwidth in c:\users\nishant\anaconda3\lib\site-packages (from ftfy) (0.2.5)
In [31]:
from ftfy import fix_text
In [32]:
val = df_updt.loc[: ,'New Description']
In [33]:
val.head()
Out[33]:
0    login issue -verified user details.(employee# ...
1    outlook \r\n\r\nreceived from: hmjdrvpb.komuay...
2    cant log in to vpn \r\n\r\nreceived from: eylq...
3    unable to access hr_tool page unable to access...
4                            skype error  skype error 
Name: New Description, dtype: object
In [34]:
df_updt['New Description'] = val.apply(fix_text)
<ipython-input-34-23ad1d1011f4>:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_updt['New Description'] = val.apply(fix_text)
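The SettingWithCopyWarning above is raised because pandas cannot tell whether df_updt owns its data or is a slice of the frame it was derived from. One common way to avoid it is to take an explicit `.copy()` when the subset is created. A minimal sketch on a made-up frame (`toy` and `subset` are hypothetical names, not the notebook's variables):

```python
import pandas as pd

toy = pd.DataFrame({"text": ["a", "a", "b"], "group": ["GRP_0", "GRP_0", "GRP_1"]})

# Explicit copy: subset owns its data, so assigning a column is unambiguous
subset = toy.drop_duplicates().copy()
subset["text_upper"] = subset["text"].str.upper()

print(subset)
```

The original frame is left untouched, and the new column exists only on the copy.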
In [35]:
#After using ftfy.fix_text on the given dataframe
df_updt["New Description"][186]
Out[35]:
'青岛兴合机电shipment notification邮箱设置 from:  \nsent: friday, october 28, 2016 7:20 am\nto: nwfodmhc exurcwkm\nsubject: re: 青岛兴合机电shipment notification邮箱设置\n\ndear,\npls help to update customer 4563729890 shipment notification email address :  abcdegy@gmail.com \n\n\nb. '
In [36]:
df_updt["New Description"][281]
Out[36]:
'unable to down load ethics module  from: brdhdd dhwduw\nsent: thursday, october 27, 2016 6:12 am\nto: nwfodmhc exurcwkm\nsubject::fwd: unable to down load ethics module \n\n\nbegin forwarded message:\nfrom:  <dqwhpjxy.pozjxbey@gmail.com>\nto:  <zanivrec.capbfhur@gmail.com>\nsubject: unable to down load ethics module \nhi  – trust doing well . i am unable to down load & getting below msg. i did reset resolution however still same issue persist.\n \nplease help.\n \n \n \n \n \ndirector of sales \ncompany indirect channels  - asia \n& \ndqwhpjxy.pozjxbey@gmail.com \n\n \n \n \n \n\n\n'
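The garbling in record 186 is typical mojibake: UTF-8 bytes that were decoded with a single-byte Windows code page. The sketch below reproduces the effect on a made-up string and reverses it with the standard library only. It illustrates the idea and is not how ftfy works internally; `fix_text` uses heuristics to detect and repair many more cases.

```python
original = "café résumé"

# Simulate the damage: UTF-8 bytes misread as Windows-1252
garbled = original.encode("utf-8").decode("cp1252")

# Reverse it: recover the original bytes, then decode them as UTF-8
repaired = garbled.encode("cp1252").decode("utf-8")

print(garbled)
print(repaired)
```

Record 281 contains no such sequences, which is why it is unchanged after `fix_text`.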
In [37]:
# Defining various function for cleaning the data 

import re  
#Remove numbers(integers)

def removeNumbers(text):
    text = ''.join([i for i in text if not i.isdigit()])         
    return text

#Replace Contractions

# Replacement strings are raw so that \g<1> is passed to re unchanged
contraction_patterns = [ (r'won\'t', 'will not'),(r'didn\'t', 'did not'),(r'didnt', 'did not'), (r'can\'t', 'cannot'),(r'cant', 'cannot'), (r'i\'m', 'i am'), (r'ain\'t', 'is not'), (r'(\w+)\'ll', r'\g<1> will'), (r'(\w+)n\'t', r'\g<1> not'),
                         (r'(\w+)\'ve', r'\g<1> have'), (r'(\w+)\'s', r'\g<1> is'), (r'(\w+)\'re', r'\g<1> are'), (r'(\w+)\'d', r'\g<1> would'), (r'&', 'and'), (r'dammit', 'damn it'), (r'dont', 'do not'), (r'wont', 'will not') ]

# Compile the patterns once instead of on every call
compiled_contractions = [(re.compile(regex), repl) for (regex, repl) in contraction_patterns]

def replaceContraction(text):
    for (pattern, repl) in compiled_contractions:
        text = pattern.sub(repl, text)
    return text


#Remove mail related words
       
def clean_data(text):
    text = re.sub(r"received from:",' ',text)
    text = re.sub(r"from:",' ',text)
    text = re.sub(r"to:",' ',text)
    text = re.sub(r"subject:",' ',text)
    text = re.sub(r"sent:",' ',text)
    text = re.sub(r"ic:",' ',text)
    text = re.sub(r"cc:",' ',text)
    text = re.sub(r"bcc:",' ',text)
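    # Note: these four patterns have no word boundaries, so they also match inside
    # longer words (e.g. "ethics" becomes "et cs" and "company" becomes " pany")
    text = re.sub(r"hi",' ',text)
    text = re.sub(r"hello",' ',text)
    text = re.sub(r"com",' ',text)
    text = re.sub(r"gmail",' ',text)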
    #Remove email 
    text = re.sub(r'\S*@\S*\s?', '', text)
    # Remove new line characters 
    text = re.sub(r'\n',' ',text)
    # Remove hashtag while keeping hashtag text
    text = re.sub(r'#','', text)
    # Replace ampersands with "and"
    text = re.sub(r'&;?', 'and',text)
    # Remove HTML special entities (e.g. &amp;)
    text = re.sub(r'\&\w*;', '', text)
    # Remove hyperlinks
    text = re.sub(r'https?:\/\/.*\/\w*', '', text)  
    # Remove characters outside the Basic Multilingual Plane (above U+FFFF)
    text= ''.join(c for c in text if c <= '\uFFFF') 
    text = text.strip()
    # Remove unreadable characters  (also extra spaces)
    text = ' '.join(re.sub("[^\u0030-\u0039\u0041-\u005a\u0061-\u007a]", " ", text).split())
    
    return text
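Because patterns such as "hi" and "com" are applied without word boundaries, they also remove those letters from inside longer words. A minimal sketch comparing the unanchored substitutions with a word-bounded alternative; the sample sentence is made up, and the `\b` version is a suggestion rather than what the notebook runs:

```python
import re

sample = "hi team, unable to download ethics module from company portal"

# Unanchored patterns, as in clean_data: they also hit the inside of longer words
naive = sample
for word in ("hi", "hello", "com", "gmail"):
    naive = re.sub(word, " ", naive)
naive = " ".join(naive.split())

# Word-bounded alternative: only whole words are removed
bounded = re.sub(r"\b(?:hi|hello|com|gmail)\b", " ", sample)
bounded = " ".join(bounded.split())

print(naive)
print(bounded)
```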
  
In [38]:
def cleandata(text):
    # remove numbers
    text = removeNumbers(text)
    
    #remove punctuations
    text = re.sub(r"\W", " ", text, flags=re.I)
    text = text.replace('_',' ')
    
    #replace contractions
    text = replaceContraction(text)
      
    #convert to lower case
    text = text.lower()
    
    #remove mail related words
    text = clean_data(text)
            
    return text
In [39]:
# Applying the cleandata function to clean the dataframe (val was captured before fix_text ran, so the original text is cleaned here)

df_updt["New Description"] = val.apply(cleandata)
<ipython-input-39-fd9a973f75cf>:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_updt["New Description"] = val.apply(cleandata)
In [40]:
#Displaying a record after the cleaning the data

df_updt["New Description"][281]
Out[40]:
'unable to down load et cs module from brdhdd dhwduw sent thursday october am to nwfodmhc exurcwkm subject fwd unable to down load et cs module begin forwarded message from dqwhpjxy pozjxbey to zanivrec capbfhur subject unable to down load et cs module trust doing well i am unable to down load getting below msg i did reset resolution however still same issue persist please help director of sales pany indirect channels asia dqwhpjxy pozjxbey'
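In the cleaned record above, the header words "from", "sent", "to" and "subject" and the address fragments are still present. The likely reason is ordering: cleandata strips punctuation before clean_data runs, so patterns that rely on ":" or "@" can no longer match. A minimal sketch of an alternative order, under the assumption that headers and addresses should be dropped first (`clean_text_sketch` and the sample text are hypothetical):

```python
import re

EMAIL = re.compile(r"\S+@\S+")
HEADER = re.compile(r"\b(?:received from|from|to|subject|sent|cc|bcc):")
GREETING = re.compile(r"\b(?:hi|hello)\b")

def clean_text_sketch(text):
    text = text.lower()
    text = EMAIL.sub(" ", text)           # needs the "@" to still be present
    text = HEADER.sub(" ", text)          # needs the ":" to still be present
    text = GREETING.sub(" ", text)        # whole words only
    text = re.sub(r"[^a-z]+", " ", text)  # only now drop digits and punctuation
    return " ".join(text.split())

raw = "From: <abc.def@gmail.com>\nSubject: unable to down load ethics module\nHi, please help."
print(clean_text_sketch(raw))
```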
In [41]:
! pip install langdetect
Requirement already satisfied: langdetect in c:\users\nishant\anaconda3\lib\site-packages (1.0.8)
Requirement already satisfied: six in c:\users\nishant\anaconda3\lib\site-packages (from langdetect) (1.15.0)
In [42]:
# Defining  a function for the detection of the various languages in the dataframe

from langdetect import detect
    
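def fn_lan_detect(text):
    try:
        return detect(text)
    except Exception:   # detect() raises when the text has no usable features
        return 'no'     # note: this fallback is also the ISO 639-1 code for Norwegian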

df_updt['Language'] = val.apply(fn_lan_detect)
<ipython-input-42-659ded715740>:11: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_updt['Language'] = val.apply(fn_lan_detect)
In [43]:
# Displaying the languages present in the dataframe with their respective number of record in the dataframe

df_updt["Language"].value_counts()
Out[43]:
en    6689
de     376
af     178
no      99
it      95
fr      81
nl      71
sv      60
es      39
ca      38
pl      33
da      30
cy      17
pt      16
tl      16
ro      15
sl       9
sq       8
hr       7
et       7
id       5
fi       5
lt       3
sk       3
lv       3
cs       2
so       2
vi       1
hu       1
Name: Language, dtype: int64
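Two caveats apply to these counts. The fallback value 'no' returned on failure is also the language code for Norwegian, so the 99 'no' records mix failed detections with Norwegian guesses. Very short English tickets are also easy to misclassify, which probably accounts for many of the rows labelled with other languages. The sketch below shows a wrapper with an unambiguous fallback; `detect_language` and `fake_detector` are hypothetical names, and the detector is passed in so the example runs without langdetect installed:

```python
def detect_language(text, detector, fallback="unknown"):
    """Return detector(text), or `fallback` if the detector raises."""
    try:
        return detector(text)
    except Exception:
        return fallback

# Stand-in detector for illustration only; in the notebook this role
# is played by langdetect.detect
def fake_detector(text):
    if not text.strip():
        raise ValueError("no features in text")
    return "en"

print(detect_language("unable to launch outlook", fake_detector))
print(detect_language("   ", fake_detector))
```

langdetect's documentation also notes that results can differ between runs unless `DetectorFactory.seed` is set before calling `detect`.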
In [44]:
# Visualizing  the languages present in the dataframe with their respective number of record in the dataframe

x = df_updt["Language"].value_counts()
x=x.sort_index()
plt.figure(figsize=(10,6))
ax= sns.barplot(x=x.index, y=x.values, alpha=0.8)
plt.title("Distribution of text by language")
plt.ylabel('Number of records')
plt.xlabel('Language')
rects = ax.patches
labels = x.values
for rect, label in zip(rects, labels):
    height = rect.get_height()
    ax.text(rect.get_x() + rect.get_width()/2, height + 5, label, ha='center', va='bottom')

plt.show()
In [45]:
import nltk
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Nishant\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Nishant\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\Nishant\AppData\Roaming\nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\Nishant\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
Out[45]:
True
In [46]:
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet, stopwords
stop = set(stopwords.words('english')) 
lemmatizer = WordNetLemmatizer()

# function to convert nltk tag to wordnet tag
def nltk_tag_to_wordnet_tag(nltk_tag):
    if nltk_tag.startswith('J'):
        return wordnet.ADJ
    elif nltk_tag.startswith('V'):
        return wordnet.VERB
    elif nltk_tag.startswith('N'):
        return wordnet.NOUN
    elif nltk_tag.startswith('R'):
        return wordnet.ADV
    else:          
        return None

def lemmatize_sentence(sentence):
    #tokenize the sentence and find the POS tag for each token
    nltk_tagged = nltk.pos_tag(nltk.word_tokenize(sentence))  
    #tuple of (token, wordnet_tag)
    wordnet_tagged = map(lambda x: (x[0], nltk_tag_to_wordnet_tag(x[1])), nltk_tagged)
    lemmatized_sentence = []
    for word, tag in wordnet_tagged:
        if tag is None:
            #if there is no available tag, append the token as is
            lemmatized_sentence.append(word)
        else:        
            #else use the tag to lemmatize the token
            lemmatized_sentence.append(lemmatizer.lemmatize(word, tag))
    return " ".join(lemmatized_sentence)
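To make the tag conversion concrete, here is a small stand-alone sketch of the same mapping. It uses the plain one-letter values behind the WordNet constants (`wordnet.ADJ` is 'a', `wordnet.VERB` is 'v', `wordnet.NOUN` is 'n', `wordnet.ADV` is 'r') so that it runs without NLTK; `to_wordnet_tag` is a hypothetical name:

```python
# First letter of a Penn Treebank tag -> WordNet part-of-speech letter
PREFIX_TO_WORDNET = {"J": "a", "V": "v", "N": "n", "R": "r"}

def to_wordnet_tag(penn_tag):
    """Map a Penn Treebank tag to a WordNet POS letter, or None if unmapped."""
    return PREFIX_TO_WORDNET.get(penn_tag[:1])

for tag in ("JJ", "VBD", "NNS", "RB", "IN"):
    print(tag, "->", to_wordnet_tag(tag))
```

Tags with no WordNet counterpart, such as the preposition tag IN, map to None, and lemmatize_sentence leaves those tokens unchanged.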
In [47]:
temp =[]
for sentence in df_updt["New Description"]:
    sentence = sentence.lower()
    l_sentence = lemmatize_sentence(sentence)
    words = [word for word in l_sentence.split() if word not in stop]
    temp.append(words)
    
In [48]:
df_updt["Lemmatized clean"] = temp
<ipython-input-48-e811b5be8ed4>:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_updt["Lemmatized clean"] = temp
In [49]:
df_updt.to_excel("TempOutput_1.xlsx")
In [50]:
df_updt.head()
Out[50]:
Short description Description Assignment group New Description Language Lemmatized clean
0 login issue -verified user details.(employee# & manager na... GRP_0 login issue verified user details employee man... en [login, issue, verify, user, detail, employee,...
1 outlook \r\n\r\nreceived from: hmjdrvpb.komuaywn@gmail... GRP_0 outlook received from hmjdrvpb komuaywn team m... en [outlook, receive, hmjdrvpb, komuaywn, team, m...
2 cant log in to vpn \r\n\r\nreceived from: eylqgodm.ybqkwiam@gmail... GRP_0 cannot log in to vpn received from eylqgodm yb... en [log, vpn, receive, eylqgodm, ybqkwiam, log, v...
3 unable to access hr_tool page unable to access hr_tool page GRP_0 unable to access hr tool page unable to access... en [unable, access, hr, tool, page, unable, acces...
4 skype error skype error GRP_0 skype error skype error no [skype, error, skype, error]
In [51]:
df_updt.isnull().sum()
Out[51]:
Short description    0
Description          0
Assignment group     0
New Description      0
Language             0
Lemmatized clean     0
dtype: int64
In [52]:
from wordcloud import WordCloud
def wordcloud_grp(f, x):
    wordclouds_0=' '.join(map(str, f))

    wc = WordCloud(width=480, height=480, max_font_size=40, min_font_size=10, max_words=150).generate(wordclouds_0)
    plt.figure(figsize=(20,10))
    plt.imshow(wc, interpolation="bilinear")
    plt.axis("off")
    plt.title("Most common words of {}".format(x))
    plt.margins(x=0, y=0)
    plt.show()
In [53]:
value = df_updt['Assignment group'].value_counts().sort_values(ascending=False).index
value
Out[53]:
Index(['GRP_0', 'GRP_8', 'GRP_24', 'GRP_12', 'GRP_9', 'GRP_2', 'GRP_19',
       'GRP_3', 'GRP_6', 'GRP_13', 'GRP_10', 'GRP_5', 'GRP_14', 'GRP_25',
       'GRP_33', 'GRP_4', 'GRP_29', 'GRP_18', 'GRP_16', 'GRP_31', 'GRP_7',
       'GRP_17', 'GRP_34', 'GRP_26', 'GRP_40', 'GRP_28', 'GRP_41', 'GRP_30',
       'GRP_15', 'GRP_42', 'GRP_20', 'GRP_45', 'GRP_22', 'GRP_1', 'GRP_11',
       'GRP_21', 'GRP_47', 'GRP_23', 'GRP_48', 'GRP_62', 'GRP_39', 'GRP_27',
       'GRP_37', 'GRP_60', 'GRP_44', 'GRP_36', 'GRP_50', 'GRP_53', 'GRP_65',
       'GRP_52', 'GRP_51', 'GRP_55', 'GRP_49', 'GRP_46', 'GRP_59', 'GRP_43',
       'GRP_32', 'GRP_66', 'GRP_56', 'GRP_58', 'GRP_63', 'GRP_38', 'GRP_68',
       'GRP_54', 'GRP_72', 'GRP_69', 'GRP_57', 'GRP_71', 'GRP_61', 'GRP_73',
       'GRP_67', 'GRP_70', 'GRP_64', 'GRP_35'],
      dtype='object')
In [54]:
for i in range(50):

    Grp = df_updt[df_updt['Assignment group'] == value[i]]
    Grp = Grp['Lemmatized clean']
    wordcloud_grp(Grp,value[i])
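A word cloud is a visual form of a frequency table. As a numeric cross-check, the most frequent tokens per group can be counted directly. A minimal sketch with made-up rows shaped like the "Lemmatized clean" column (`rows` and `top_words` are hypothetical):

```python
from collections import Counter

# Made-up (group, token list) pairs in the same shape as the notebook's data
rows = [
    ("GRP_0", ["password", "reset", "password", "reset"]),
    ("GRP_0", ["account", "lock", "password"]),
    ("GRP_8", ["job", "fail", "scheduler"]),
]

def top_words(rows, group, n=3):
    """Return the n most frequent tokens across all rows of one group."""
    counter = Counter()
    for grp, tokens in rows:
        if grp == group:
            counter.update(tokens)
    return counter.most_common(n)

print(top_words(rows, "GRP_0"))
```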
In [55]:
max_features = 10000
maxlen = 25
In [56]:
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

tokenizer = Tokenizer(num_words=max_features, split=' ')
tokenizer.fit_on_texts(df_updt["New Description"].values)
X = tokenizer.texts_to_sequences(df_updt["New Description"].values)
In [57]:
sequences = tokenizer.texts_to_sequences(df_updt["New Description"].values)
X = pad_sequences(sequences, maxlen = maxlen)

print(len(X))
7909
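With maxlen = 25, pad_sequences brings every ticket to exactly 25 token ids. Under its default settings (padding='pre', truncating='pre'), shorter sequences get zeros at the front and longer ones keep only their last 25 ids. A plain-Python sketch of that behaviour (`pad_pre` is a hypothetical helper, not part of Keras):

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic pre-truncation and pre-padding on a single list of ids."""
    seq = seq[-maxlen:]                       # keep only the last maxlen ids
    return [value] * (maxlen - len(seq)) + seq  # left-pad the rest

print(pad_pre([1, 2, 3], 5))
print(pad_pre([1, 2, 3, 4, 5, 6], 4))
```

For long tickets this means the opening words are discarded, so a different maxlen or truncating='post' may be worth considering if the start of a ticket carries the key information.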
In [58]:
tokenizer.word_index
Out[58]:
{'to': 1,
 'the': 2,
 'in': 3,
 'job': 4,
 'from': 5,
 'is': 6,
 'no': 7,
 'not': 8,
 'on': 9,
 'pany': 10,
 'i': 11,
 'and': 12,
 'for': 13,
 'tool': 14,
 'a': 15,
 's': 16,
 'at': 17,
 'received': 18,
 'please': 19,
 't': 20,
 'yes': 21,
 'na': 22,
 'password': 23,
 'scheduler': 24,
 'erp': 25,
 'of': 26,
 'it': 27,
 'failed': 28,
 'sid': 29,
 'access': 30,
 'user': 31,
 'unable': 32,
 'issue': 33,
 'reset': 34,
 'ticket': 35,
 'my': 36,
 'with': 37,
 'you': 38,
 'have': 39,
 'error': 40,
 'hostname': 41,
 'monitoring': 42,
 'can': 43,
 'e': 44,
 'am': 45,
 'are': 46,
 'email': 47,
 'outlook': 48,
 'be': 49,
 'account': 50,
 'working': 51,
 'site': 52,
 'we': 53,
 'that': 54,
 'f': 55,
 'help': 56,
 'when': 57,
 'as': 58,
 'need': 59,
 'system': 60,
 'circuit': 61,
 'id': 62,
 'power': 63,
 'name': 64,
 'login': 65,
 'network': 66,
 'an': 67,
 'but': 68,
 'vendor': 69,
 'was': 70,
 'has': 71,
 'c': 72,
 'et': 73,
 'x': 74,
 'or': 75,
 'update': 76,
 'if': 77,
 'd': 78,
 'by': 79,
 'down': 80,
 'new': 81,
 'server': 82,
 'backup': 83,
 'image': 84,
 'w': 85,
 'me': 86,
 'what': 87,
 'usa': 88,
 'engineering': 89,
 'all': 90,
 'crm': 91,
 'out': 92,
 'message': 93,
 'tele': 94,
 'cid': 95,
 'phone': 96,
 'see': 97,
 'vpn': 98,
 'event': 99,
 'outage': 100,
 'able': 101,
 'below': 102,
 'printer': 103,
 'skype': 104,
 'type': 105,
 'will': 106,
 'cannot': 107,
 'log': 108,
 'do': 109,
 'inside': 110,
 'team': 111,
 'pm': 112,
 'plant': 113,
 'does': 114,
 'number': 115,
 'your': 116,
 'up': 117,
 'pc': 118,
 'device': 119,
 'request': 120,
 'customer': 121,
 'since': 122,
 'open': 123,
 'get': 124,
 'time': 125,
 'mit': 126,
 'check': 127,
 'windows': 128,
 'locked': 129,
 'problem': 130,
 'microsoft': 131,
 'best': 132,
 'contact': 133,
 'order': 134,
 'collaboration': 135,
 'platform': 136,
 'm': 137,
 'sent': 138,
 'change': 139,
 'data': 140,
 'ip': 141,
 'tcp': 142,
 'software': 143,
 'manager': 144,
 'connection': 145,
 'connect': 146,
 'b': 147,
 'r': 148,
 'any': 149,
 'summary': 150,
 'been': 151,
 'maintenance': 152,
 'information': 153,
 'scheduled': 154,
 'sales': 155,
 'mailto': 156,
 'laptop': 157,
 'management': 158,
 'nicht': 159,
 'abended': 160,
 'call': 161,
 'hr': 162,
 'provider': 163,
 'asa': 164,
 'using': 165,
 'cert': 166,
 'notified': 167,
 'maint': 168,
 'internet': 169,
 'into': 170,
 'there': 171,
 'group': 172,
 'ng': 173,
 'production': 174,
 'attached': 175,
 'getting': 176,
 'png': 177,
 'us': 178,
 'our': 179,
 'eu': 180,
 'work': 181,
 'other': 182,
 'puter': 183,
 'file': 184,
 'report': 185,
 'screen': 186,
 'subject': 187,
 'office': 188,
 'now': 189,
 'start': 190,
 'language': 191,
 'available': 192,
 'unlock': 193,
 'after': 194,
 'ch': 195,
 'p': 196,
 'same': 197,
 'following': 198,
 'did': 199,
 'print': 200,
 'global': 201,
 'could': 202,
 'delivery': 203,
 'germany': 204,
 'also': 205,
 'business': 206,
 'service': 207,
 'via': 208,
 'create': 209,
 'users': 210,
 'remote': 211,
 'he': 212,
 'jpg': 213,
 'address': 214,
 'space': 215,
 'inc': 216,
 'add': 217,
 'explorer': 218,
 'uacyltoe': 219,
 'active': 220,
 'one': 221,
 'ms': 222,
 'portal': 223,
 'issues': 224,
 'so': 225,
 'browser': 226,
 'dear': 227,
 'src': 228,
 'dst': 229,
 'application': 230,
 'mail': 231,
 'only': 232,
 'client': 233,
 'die': 234,
 'inwarehouse': 235,
 'source': 236,
 'und': 237,
 'install': 238,
 'bitte': 239,
 're': 240,
 'probleme': 241,
 'inplant': 242,
 'needs': 243,
 'telephone': 244,
 'use': 245,
 'deny': 246,
 'view': 247,
 'acl': 248,
 'gsc': 249,
 'hxgaycze': 250,
 'outside': 251,
 'interface': 252,
 'status': 253,
 'emails': 254,
 'some': 255,
 'details': 256,
 'link': 257,
 'started': 258,
 'sep': 259,
 'port': 260,
 'mm': 261,
 'trying': 262,
 'would': 263,
 'code': 264,
 'additional': 265,
 'blocked': 266,
 'passwords': 267,
 'back': 268,
 'know': 269,
 'der': 270,
 'mobile': 271,
 'location': 272,
 'verified': 273,
 'folder': 274,
 'reporting': 275,
 'employee': 276,
 'these': 277,
 'ap': 278,
 'g': 279,
 'set': 280,
 'top': 281,
 'let': 282,
 'tried': 283,
 'provide': 284,
 'telephony': 285,
 'sw': 286,
 'note': 287,
 'host': 288,
 'cc': 289,
 'should': 290,
 'o': 291,
 'ist': 292,
 'n': 293,
 'nwfodmhc': 294,
 'exurcwkm': 295,
 'setup': 296,
 'equipment': 297,
 'dial': 298,
 'agent': 299,
 'required': 300,
 'over': 301,
 'cs': 302,
 'they': 303,
 'changed': 304,
 'like': 305,
 'disk': 306,
 'specify': 307,
 'having': 308,
 'slow': 309,
 'drive': 310,
 'still': 311,
 'again': 312,
 'app': 313,
 'connected': 314,
 'through': 315,
 'may': 316,
 'diagnostics': 317,
 'how': 318,
 'verizon': 319,
 'destination': 320,
 'files': 321,
 'support': 322,
 'center': 323,
 'ad': 324,
 'security': 325,
 'priority': 326,
 'possible': 327,
 'urgent': 328,
 'used': 329,
 'gr': 330,
 'u': 331,
 'list': 332,
 'aug': 333,
 'then': 334,
 'kindly': 335,
 'http': 336,
 'today': 337,
 'per': 338,
 'internal': 339,
 'gh': 340,
 'full': 341,
 'showing': 342,
 'process': 343,
 'notification': 344,
 'le': 345,
 'monitor': 346,
 'de': 347,
 'more': 348,
 'printing': 349,
 'enter': 350,
 'had': 351,
 'date': 352,
 'ne': 353,
 'running': 354,
 'try': 355,
 'excel': 356,
 'mii': 357,
 'find': 358,
 'page': 359,
 'en': 360,
 'total': 361,
 'morning': 362,
 'expense': 363,
 'mac': 364,
 'due': 365,
 'action': 366,
 'september': 367,
 'found': 368,
 'created': 369,
 'resolve': 370,
 'material': 371,
 'da': 372,
 'k': 373,
 'apac': 374,
 'india': 375,
 'august': 376,
 'pls': 377,
 'services': 378,
 'missing': 379,
 'being': 380,
 'just': 381,
 'exe': 382,
 'wifi': 383,
 'von': 384,
 'defekt': 385,
 'orders': 386,
 'alerts': 387,
 'very': 388,
 'freundlichen': 389,
 'send': 390,
 'volume': 391,
 'document': 392,
 'count': 393,
 'got': 394,
 'hana': 395,
 'last': 396,
 'consumed': 397,
 'ticketing': 398,
 'auf': 399,
 'vip': 400,
 'good': 401,
 'attachment': 402,
 'agents': 403,
 'traffic': 404,
 'ess': 405,
 'run': 406,
 'october': 407,
 'content': 408,
 'domain': 409,
 'dell': 410,
 'label': 411,
 'iphone': 412,
 'hallo': 413,
 'plete': 414,
 'advise': 415,
 'look': 416,
 'events': 417,
 'problems': 418,
 'h': 419,
 'meeting': 420,
 'ich': 421,
 'l': 422,
 'logon': 423,
 'before': 424,
 'old': 425,
 'updated': 426,
 'installation': 427,
 'version': 428,
 'she': 429,
 'prod': 430,
 'wrong': 431,
 'audio': 432,
 'resolved': 433,
 'fix': 434,
 'receive': 435,
 'den': 436,
 'about': 437,
 'form': 438,
 'confirmed': 439,
 'net': 440,
 'query': 441,
 'her': 442,
 'kann': 443,
 'drucker': 444,
 'bei': 445,
 'correct': 446,
 'payroll': 447,
 'make': 448,
 'model': 449,
 'pleted': 450,
 'sign': 451,
 'click': 452,
 'pcap': 453,
 'tools': 454,
 'ie': 455,
 'external': 456,
 'driver': 457,
 'item': 458,
 'distributor': 459,
 'their': 460,
 'why': 461,
 'warning': 462,
 'mitteilung': 463,
 'warehouse': 464,
 'es': 465,
 'incident': 466,
 'related': 467,
 'local': 468,
 'installed': 469,
 'multiple': 470,
 'im': 471,
 'programdnty': 472,
 'submit': 473,
 'search': 474,
 'online': 475,
 'opening': 476,
 'where': 477,
 'diese': 478,
 'etc': 479,
 'were': 480,
 'hub': 481,
 'shows': 482,
 'too': 483,
 'purchasing': 484,
 'web': 485,
 'reports': 486,
 'two': 487,
 'automatically': 488,
 'accounts': 489,
 'po': 490,
 'times': 491,
 'df': 492,
 'supply': 493,
 'screenshot': 494,
 'because': 495,
 'switch': 496,
 'response': 497,
 'product': 498,
 'go': 499,
 'ing': 500,
 'show': 501,
 'display': 502,
 'ws': 503,
 'dn': 504,
 'sto': 505,
 'plm': 506,
 'mails': 507,
 'them': 508,
 'than': 509,
 'checked': 510,
 'rechner': 511,
 'ewew': 512,
 'kind': 513,
 'cold': 514,
 'select': 515,
 'field': 516,
 'line': 517,
 'pdf': 518,
 'day': 519,
 'approved': 520,
 'home': 521,
 'called': 522,
 'sir': 523,
 'want': 524,
 'exchange': 525,
 'receiving': 526,
 'admin': 527,
 'transaction': 528,
 'fe': 529,
 'questions': 530,
 'blank': 531,
 'finance': 532,
 'denied': 533,
 'certificate': 534,
 'fine': 535,
 'sie': 536,
 'default': 537,
 'off': 538,
 'aerp': 539,
 'calls': 540,
 'documents': 541,
 'ex': 542,
 'load': 543,
 'sync': 544,
 'funktioniert': 545,
 'save': 546,
 'arc': 547,
 'impact': 548,
 'review': 549,
 'incidents': 550,
 'das': 551,
 'z': 552,
 'assist': 553,
 'don': 554,
 'desktop': 555,
 'download': 556,
 'scan': 557,
 'many': 558,
 'applications': 559,
 'jul': 560,
 'another': 561,
 'wireless': 562,
 'activation': 563,
 'needed': 564,
 'added': 565,
 'stock': 566,
 'changes': 567,
 'detail': 568,
 'price': 569,
 'media': 570,
 'vid': 571,
 'says': 572,
 'th': 573,
 'doesn': 574,
 'during': 575,
 'lean': 576,
 'longer': 577,
 'days': 578,
 'netweaver': 579,
 'bex': 580,
 'analysis': 581,
 'fw': 582,
 'bkwin': 583,
 'attach': 584,
 'bobj': 585,
 'shot': 586,
 'its': 587,
 'few': 588,
 'every': 589,
 'friday': 590,
 'under': 591,
 'even': 592,
 'inbound': 593,
 'delete': 594,
 'south': 595,
 'handling': 596,
 'caller': 597,
 'personal': 598,
 'training': 599,
 'chain': 600,
 'mehr': 601,
 'tax': 602,
 'free': 603,
 'teamviewer': 604,
 'java': 605,
 'take': 606,
 'cost': 607,
 'responding': 608,
 'currently': 609,
 'assign': 610,
 'pp': 611,
 'next': 612,
 'monday': 613,
 'um': 614,
 'who': 615,
 'desk': 616,
 'explicit': 617,
 'correctly': 618,
 'further': 619,
 'already': 620,
 'hrp': 621,
 'interaction': 622,
 'co': 623,
 'oder': 624,
 'rule': 625,
 'loading': 626,
 'end': 627,
 'sich': 628,
 'netch': 629,
 'copy': 630,
 'dsw': 631,
 'systems': 632,
 'activity': 633,
 'processing': 634,
 'gb': 635,
 'website': 636,
 'hard': 637,
 'pl': 638,
 'dc': 639,
 'give': 640,
 'ic': 641,
 'going': 642,
 'servers': 643,
 'upgrade': 644,
 'sinkhole': 645,
 'expired': 646,
 'gso': 647,
 'several': 648,
 'both': 649,
 'well': 650,
 'options': 651,
 'guest': 652,
 'threshold': 653,
 'rtr': 654,
 'sure': 655,
 'outbound': 656,
 'refer': 657,
 'workflow': 658,
 'lock': 659,
 'post': 660,
 'person': 661,
 'alert': 662,
 'week': 663,
 'items': 664,
 'sartlgeo': 665,
 'yesterday': 666,
 'done': 667,
 'either': 668,
 'inspector': 669,
 'assigned': 670,
 'daily': 671,
 'incorrect': 672,
 'attendance': 673,
 'without': 674,
 'deleted': 675,
 'someone': 676,
 'updating': 677,
 'floor': 678,
 'win': 679,
 'ascii': 680,
 'hex': 681,
 've': 682,
 'www': 683,
 'remove': 684,
 'output': 685,
 'failure': 686,
 'approval': 687,
 'primary': 688,
 'here': 689,
 'unlocked': 690,
 'settings': 691,
 'pr': 692,
 'close': 693,
 'shared': 694,
 'database': 695,
 'zugriff': 696,
 'bk': 697,
 'ab': 698,
 'drawings': 699,
 'partner': 700,
 'bkbackup': 701,
 'lhqksbdx': 702,
 'located': 703,
 'shop': 704,
 'werden': 705,
 'requested': 706,
 'project': 707,
 'hotf': 708,
 'packet': 709,
 'duration': 710,
 'hp': 711,
 'current': 712,
 'first': 713,
 'description': 714,
 'seems': 715,
 'amerirtca': 716,
 'quote': 717,
 'organization': 718,
 'durch': 719,
 'each': 720,
 'ac': 721,
 'hours': 722,
 'works': 723,
 'org': 724,
 'until': 725,
 'engineer': 726,
 'munication': 727,
 'keep': 728,
 'calling': 729,
 'calendar': 730,
 'pcs': 731,
 'logging': 732,
 'tab': 733,
 'immediately': 734,
 'investigate': 735,
 'license': 736,
 'area': 737,
 'dev': 738,
 'condition': 739,
 'j': 740,
 'correlation': 741,
 'dynamics': 742,
 'different': 743,
 'directory': 744,
 'word': 745,
 'went': 746,
 'seeing': 747,
 'example': 748,
 'connecting': 749,
 'indicate': 750,
 'occurrence': 751,
 'average': 752,
 'etime': 753,
 'lan': 754,
 'stopped': 755,
 'transfer': 756,
 'fail': 757,
 'passwort': 758,
 'future': 759,
 'supervisor': 760,
 'nach': 761,
 'renew': 762,
 'month': 763,
 'above': 764,
 'restart': 765,
 'however': 766,
 'wel': 767,
 'reason': 768,
 'sincerely': 769,
 'authorized': 770,
 'soc': 771,
 'username': 772,
 'sound': 773,
 'st': 774,
 'allow': 775,
 'dd': 776,
 'pping': 777,
 'wit': 778,
 'teams': 779,
 'administrator': 780,
 'importance': 781,
 'customers': 782,
 'netbios': 783,
 'advised': 784,
 'value': 785,
 'db': 786,
 'sql': 787,
 'employees': 788,
 'once': 789,
 'est': 790,
 'v': 791,
 'rxoynvgi': 792,
 'ntgdsehl': 793,
 'snp': 794,
 'heu': 795,
 'regen': 796,
 'samples': 797,
 'israel': 798,
 'malware': 799,
 'flags': 800,
 'syn': 801,
 'regarding': 802,
 'option': 803,
 'disconnected': 804,
 'approve': 805,
 'node': 806,
 'enable': 807,
 'appears': 808,
 'ping': 809,
 'tuesday': 810,
 'bit': 811,
 'right': 812,
 'wednesday': 813,
 'entered': 814,
 'relay': 815,
 'escalation': 816,
 'result': 817,
 'ef': 818,
 'errors': 819,
 'card': 820,
 'soon': 821,
 'configuration': 822,
 'dp': 823,
 'attachments': 824,
 'batch': 825,
 'launch': 826,
 'ee': 827,
 'dhcpd': 828,
 'dhcpack': 829,
 'eth': 830,
 'lease': 831,
 'udp': 832,
 'confirm': 833,
 'case': 834,
 'provided': 835,
 'danke': 836,
 'financial': 837,
 'handle': 838,
 'technical': 839,
 'gmbh': 840,
 'android': 841,
 'cvss': 842,
 'protocol': 843,
 'infected': 844,
 'purposes': 845,
 'critical': 846,
 'keeps': 847,
 'shown': 848,
 'needful': 849,
 'mentioned': 850,
 'disabled': 851,
 'box': 852,
 'tracker': 853,
 'read': 854,
 'distribution': 855,
 'asking': 856,
 'reference': 857,
 'spam': 858,
 'rakthyesh': 859,
 'corresponding': 860,
 'devices': 861,
 'share': 862,
 'method': 863,
 'reply': 864,
 'text': 865,
 'between': 866,
 'balancing': 867,
 'effective': 868,
 'pricing': 869,
 'wu': 870,
 'ok': 871,
 'departments': 872,
 'solve': 873,
 'keine': 874,
 'reboot': 875,
 'recently': 876,
 'mailbox': 877,
 'drivers': 878,
 'deployment': 879,
 'lhqsm': 880,
 'icmp': 881,
 'lost': 882,
 'qa': 883,
 'profile': 884,
 'cell': 885,
 'sr': 886,
 'discount': 887,
 'fy': 888,
 'filesys': 889,
 'facing': 890,
 'wenn': 891,
 'accept': 892,
 'generating': 893,
 'jobs': 894,
 'window': 895,
 'thursday': 896,
 'won': 897,
 'dmvpn': 898,
 'minutes': 899,
 'station': 900,
 'valid': 901,
 'german': 902,
 'changing': 903,
 'battery': 904,
 'vkzwafuh': 905,
 'tcjnuswg': 906,
 'firewall': 907,
 'anymore': 908,
 'release': 909,
 'fixed': 910,
 'alternate': 911,
 'manually': 912,
 'sending': 913,
 'owned': 914,
 'scanner': 915,
 'em': 916,
 'os': 917,
 'hxgayczeing': 918,
 'credentials': 919,
 'ein': 920,
 'control': 921,
 'bw': 922,
 'boot': 923,
 'symantec': 924,
 'forward': 925,
 'somet': 926,
 'ce': 927,
 'latitude': 928,
 'vlan': 929,
 'concerns': 930,
 'php': 931,
 'delegating': 932,
 'medium': 933,
 'session': 934,
 'button': 935,
 'printed': 936,
 'drawing': 937,
 'point': 938,
 'hq': 939,
 'billing': 940,
 'ea': 941,
 'long': 942,
 'dns': 943,
 'mpls': 944,
 'must': 945,
 'describe': 946,
 'turn': 947,
 'dat': 948,
 'rpc': 949,
 'directionality': 950,
 'scwx': 951,
 'sherlock': 952,
 'sle': 953,
 'instances': 954,
 'upload': 955,
 'tablet': 956,
 'made': 957,
 'wy': 958,
 'disclaimer': 959,
 'past': 960,
 'records': 961,
 'temporarily': 962,
 'size': 963,
 'tag': 964,
 'moved': 965,
 'url': 966,
 'dwfiykeo': 967,
 'argtxmvcumar': 968,
 'q': 969,
 'rth': 970,
 'english': 971,
 'key': 972,
 'er': 973,
 'creating': 974,
 'wird': 975,
 'room': 976,
 'directly': 977,
 'sind': 978,
 'performance': 979,
 'hear': 980,
 'asset': 981,
 'utc': 982,
 'score': 983,
 'differently': 984,
 'escalating': 985,
 'meetings': 986,
 'non': 987,
 'part': 988,
 'basis': 989,
 'null': 990,
 'starting': 991,
 'msd': 992,
 'detected': 993,
 'mfg': 994,
 'rerouted': 995,
 'zu': 996,
 'function': 997,
 'jionmpsf': 998,
 'wnkpzcmv': 999,
 'affected': 1000,
 ...}

Because the assignment groups are heavily imbalanced, balance the data with a technique such as undersampling or oversampling before feeding it to the models.

Steps:
1) Split the data into train and test sets.
2) Balance the training data using the chosen resampling technique.
3) Train the model.
4) Test the model on the test set.
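The order of the steps above can be sketched on a small made-up frame. This is only an illustration under stated assumptions: the `toy` data, the 0.25 test fraction and the variable names are hypothetical, and `stratify` assumes every group has at least two rows, which may not hold for the smallest assignment groups in the real data.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Toy stand-in for the ticket data (hypothetical values, not from the dataset).
toy = pd.DataFrame({
    "text": [f"ticket {i}" for i in range(30)],
    "group": ["GRP_0"] * 20 + ["GRP_1"] * 6 + ["GRP_2"] * 4,
})

# 1) Split first, stratifying so every group appears in both parts.
train_df, test_df = train_test_split(
    toy, test_size=0.25, stratify=toy["group"], random_state=10
)

# 2) Oversample each group in the training part only, up to the largest group size.
target = int(train_df["group"].value_counts().max())
balanced_train = pd.concat(
    [
        resample(part, replace=True, n_samples=target, random_state=123)
        for _, part in train_df.groupby("group")
    ]
)

# The test part keeps its original distribution and shares no rows with training.
print(balanced_train["group"].value_counts().to_dict())
print(test_df["group"].value_counts().to_dict())
```

Because the resampling only sees `train_df`, no copy of a test row can end up in the training data, so the test score is measured on rows the model has not seen.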

In [59]:
df_updt.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 7909 entries, 0 to 8499
Data columns (total 6 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   Short description  7909 non-null   object
 1   Description        7909 non-null   object
 2   Assignment group   7909 non-null   object
 3   New Description    7909 non-null   object
 4   Language           7909 non-null   object
 5   Lemmatized clean   7909 non-null   object
dtypes: object(6)
memory usage: 752.5+ KB
In [60]:
df_to_process = df_updt.copy()
In [61]:
df_to_process.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 7909 entries, 0 to 8499
Data columns (total 6 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   Short description  7909 non-null   object
 1   Description        7909 non-null   object
 2   Assignment group   7909 non-null   object
 3   New Description    7909 non-null   object
 4   Language           7909 non-null   object
 5   Lemmatized clean   7909 non-null   object
dtypes: object(6)
memory usage: 432.5+ KB
In [62]:
df_to_process.drop(["Short description", "Description", "Language"],axis=1,inplace=True)
In [63]:
df_to_process.head(5)
Out[63]:
Assignment group New Description Lemmatized clean
0 GRP_0 login issue verified user details employee man... [login, issue, verify, user, detail, employee,...
1 GRP_0 outlook received from hmjdrvpb komuaywn team m... [outlook, receive, hmjdrvpb, komuaywn, team, m...
2 GRP_0 cannot log in to vpn received from eylqgodm yb... [log, vpn, receive, eylqgodm, ybqkwiam, log, v...
3 GRP_0 unable to access hr tool page unable to access... [unable, access, hr, tool, page, unable, acces...
4 GRP_0 skype error skype error [skype, error, skype, error]
In [64]:
def wordTokenizer(dataframe):
    tokenizer = Tokenizer(num_words=numWords,filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',lower=True,split=' ', char_level=False)
    tokenizer.fit_on_texts(dataframe)
    dataframe = tokenizer.texts_to_sequences(dataframe)
    return tokenizer,dataframe
In [65]:
df_to_process.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 7909 entries, 0 to 8499
Data columns (total 3 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   Assignment group  7909 non-null   object
 1   New Description   7909 non-null   object
 2   Lemmatized clean  7909 non-null   object
dtypes: object(3)
memory usage: 247.2+ KB
In [66]:
from sklearn import preprocessing 
  
label_encoder = preprocessing.LabelEncoder() 
  
df_to_process['Assignment group ID']= label_encoder.fit_transform(df_to_process['Assignment group']) 
df_to_process['Assignment group ID'].unique()
Out[66]:
array([ 0,  1, 23, 34, 45, 56, 67, 72, 73,  2,  3,  4,  5,  6,  7,  8,  9,
       10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 25, 27, 28,
       29, 30, 31, 32, 33, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 46, 47,
       48, 49, 50, 51, 52, 53, 54, 55, 57, 58, 26, 59, 60, 61, 62, 63, 64,
       65, 66, 68, 69, 70, 71])
In [67]:
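# Tokenizer and pad_sequences are used here, before the later Keras import cell
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

maxlen = 150      # every sequence is padded or truncated to this many tokens
numWords = 9000   # vocabulary size kept by the tokenizer
tokenizer,X = wordTokenizer(df_to_process['New Description'])
y = np.asarray(df_to_process['Assignment group'])
X = pad_sequences(X, maxlen = maxlen)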
In [68]:
df_to_process["Assignment group"].unique()
Out[68]:
array(['GRP_0', 'GRP_1', 'GRP_3', 'GRP_4', 'GRP_5', 'GRP_6', 'GRP_7',
       'GRP_8', 'GRP_9', 'GRP_10', 'GRP_11', 'GRP_12', 'GRP_13', 'GRP_14',
       'GRP_15', 'GRP_16', 'GRP_17', 'GRP_18', 'GRP_19', 'GRP_2',
       'GRP_20', 'GRP_21', 'GRP_22', 'GRP_23', 'GRP_24', 'GRP_25',
       'GRP_26', 'GRP_27', 'GRP_28', 'GRP_29', 'GRP_30', 'GRP_31',
       'GRP_33', 'GRP_34', 'GRP_35', 'GRP_36', 'GRP_37', 'GRP_38',
       'GRP_39', 'GRP_40', 'GRP_41', 'GRP_42', 'GRP_43', 'GRP_44',
       'GRP_45', 'GRP_46', 'GRP_47', 'GRP_48', 'GRP_49', 'GRP_50',
       'GRP_51', 'GRP_52', 'GRP_53', 'GRP_54', 'GRP_55', 'GRP_56',
       'GRP_57', 'GRP_58', 'GRP_59', 'GRP_60', 'GRP_61', 'GRP_32',
       'GRP_62', 'GRP_63', 'GRP_64', 'GRP_65', 'GRP_66', 'GRP_67',
       'GRP_68', 'GRP_69', 'GRP_70', 'GRP_71', 'GRP_72', 'GRP_73'],
      dtype=object)
In [69]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)
In [70]:
# import the train test split package from scikit learn
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=10)
In [71]:
from sklearn import metrics
from sklearn.ensemble import RandomForestClassifier

rf=RandomForestClassifier()
rf.fit(X_train,y_train)
y_pred=rf.predict(X_test)
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
Accuracy: 0.5722713864306784
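
With imbalanced groups, plain accuracy can be dominated by the largest group. A minimal sketch of metrics that weight every group equally, using made-up labels rather than the notebook's predictions (the `y_true` and `y_pred` lists below are hypothetical):

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

# Made-up labels: a classifier that always predicts the majority group.
y_true = ["GRP_0"] * 8 + ["GRP_1"] * 1 + ["GRP_2"] * 1
y_pred = ["GRP_0"] * 10

acc = accuracy_score(y_true, y_pred)
# Balanced accuracy is the mean of the per-group recalls.
bal_acc = balanced_accuracy_score(y_true, y_pred)
# Macro F1 averages the per-group F1 scores without weighting by group size.
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)

print("accuracy:", acc)
print("balanced accuracy:", bal_acc)
print("macro F1:", macro_f1)
```

Here accuracy looks high even though two of the three groups are never predicted, which the other two scores make visible.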

GRU

In [72]:
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from tensorflow.keras.layers import Dense, Input, LSTM, Embedding, Dropout, Activation, Flatten, Bidirectional, GlobalMaxPool1D,GRU,Conv1D,MaxPooling1D
from tensorflow.keras.models import Model, Sequential
import tensorflow as tf
from sklearn import metrics
from tensorflow.keras import backend as K
import matplotlib.pyplot as plt
from tensorflow.keras.utils import plot_model
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
In [73]:
class GruGloveModel:
  model= Model()
  X_test=[]
  y_test=[]
  embedding_matrix=[]

  def wordTokenizer(self, dataframe):
    tokenizer = Tokenizer(num_words=numWords,filters='!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n',lower=True,split=' ', char_level=False)
    tokenizer.fit_on_texts(dataframe)
    dataframe = tokenizer.texts_to_sequences(dataframe)
    return tokenizer,dataframe
  
  def splitData(self,X,y):

    print("Number of Samples:", len(X))
    print("Number of Labels: ", len(y))
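    X_train, self.X_test, y_train, self.y_test = train_test_split(X, y, test_size=0.2, random_state=10) # changed by Abraham
    # NOTE: this second call splits the full X, y again with the same test_size and random_state,
    # so X_Val, y_Val hold the same rows as self.X_test, self.y_test.
    X_train, X_Val, y_train, y_Val = train_test_split(X, y, test_size=0.2, random_state=10)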
    print("Number of train Samples:", len(X_train))
    print("Number of val Samples:", len(X_Val))

    return X_train, self.X_test, y_train, self.y_test, X_Val, y_Val

  def tokenizeAndEmbedding(self,dataframe):
    
    tokenizer,X = self.wordTokenizer(dataframe['New Description'])
    y = np.asarray(dataframe['Assignment group ID'])
    X = pad_sequences(X, maxlen = maxlen)

    self.embedding_matrix = np.zeros((numWords+1, 100))
    for i,word in tokenizer.index_word.items():
      if i<numWords+1:
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            self.embedding_matrix[i] = embedding_vector
    return X,y

  def train(self, dataframe, batch_size, epochs):
   
    X,y = self.tokenizeAndEmbedding(dataframe)
    X_train, _, y_train, _, X_Val, y_Val = self.splitData(X,y)
    model_history = self.fitModel(X_train,y_train,X_Val,y_Val,batch_size, epochs)
    return model_history

  def fitModel(self,X_train,y_train,X_Val,y_Val,batch_size, epochs):
    
    input_layer = Input(shape=(maxlen,),dtype=tf.int64)
    embed = Embedding(numWords+1,output_dim=100,input_length=maxlen,weights=[self.embedding_matrix], trainable=True)(input_layer)  #weights=[embedding_matrix]
    gru=GRU(128)(embed)
    drop=Dropout(0.3)(gru)
    dense =Dense(100,activation='relu')(drop)
    out=Dense(len((pd.Series(y_train)).unique())+1,activation='softmax')(dense)   

    self.model = Model(input_layer,out)
    self.model.compile(loss='sparse_categorical_crossentropy',optimizer="adam",metrics=['accuracy'])

    # self.model.summary()
    # plot_model(self.model,to_file="GRU.jpg")

    checkpoint = ModelCheckpoint('model-{epoch:03d}-{val_accuracy:03f}.h5', verbose=1, monitor='val_accuracy',save_best_only=True, mode='auto') 
    reduceLoss = ReduceLROnPlateau(monitor='val_loss', factor=0.2,patience=2, min_lr=0.0001)
    model_history = self.model.fit(X_train,y_train,batch_size=batch_size, epochs=epochs, callbacks=[checkpoint,reduceLoss], validation_data=(X_Val,y_Val))
    return model_history,self.model

  def prediction(self):
      
     pred = self.model.predict(self.X_test)
     pred = [i.argmax() for i in pred]
     accuracy=metrics.accuracy_score(self.y_test, pred)
     print("Accuracy of the model :",accuracy)
     return accuracy

  def plotModelAccuracy(self, history, modelname):
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])

    plt.title(modelname+' model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train','validation'], loc='upper left')
    plt.show()

    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])

    plt.title(modelname+' model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train','validation'], loc='upper left')
    plt.show()

  def plotModel(self):
    self.model.summary()
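
In `splitData` above, both calls to `train_test_split` receive the full `X, y` with the same `test_size` and `random_state`, so the validation part and the test part contain the same rows. A minimal sketch of one way to get three disjoint parts is shown below; `three_way_split` and the demo arrays are hypothetical names used for illustration and are not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def three_way_split(X, y, test_size=0.2, val_size=0.2, seed=10):
    """Hypothetical helper: return train, validation and test parts that do not overlap."""
    # Hold out the test part first.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    # Split the remainder again so validation rows never come from the test part.
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=val_size, random_state=seed
    )
    return X_train, X_val, X_test, y_train, y_val, y_test

# Demo on made-up data: 100 rows, 4 classes.
X_demo = np.arange(100).reshape(-1, 1)
y_demo = np.arange(100) % 4
X_train, X_val, X_test, y_train, y_val, y_test = three_way_split(X_demo, y_demo)
print(len(X_train), len(X_val), len(X_test))
```

With a split like this, the validation part can drive checkpointing and learning-rate scheduling while the test part is only used once, for the final score.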
In [74]:
from gensim.models import Word2Vec

sentences = [line.split(' ') for line in df_to_process['New Description']]
word2vec = Word2Vec(sentences=sentences,min_count=1)
word2vec.wv.save_word2vec_format(r'C:\Users\Nishant\Desktop\word2vec_vector.txt')
In [75]:
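# load the whole embedding into memory
embeddings_index = dict()
# Note: the word2vec text format begins with a "<vocab size> <vector size>" header line,
# which this loop stores like any other entry, so it is included in the count printed below.
with open(r'C:\Users\Nishant\Desktop\word2vec_vector.txt') as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs
print('Loaded %s word vectors.' % len(embeddings_index))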
Loaded 15390 word vectors.
In [76]:
# Check how the GRU model performs with the cleansed data
epochs=15  # matches the 15 epochs shown in the training log below
gruModelRawData = GruGloveModel()
gruModelRawData_history, model = gruModelRawData.train(df_to_process,100,epochs)
gruRaw_accuracy = gruModelRawData.prediction()
Number of Samples: 7909
Number of Labels:  7909
Number of train Samples: 6327
Number of val Samples: 1582
Epoch 1/15
64/64 [==============================] - ETA: 0s - loss: 2.5812 - accuracy: 0.4787
Epoch 00001: val_accuracy improved from -inf to 0.52276, saving model to model-001-0.522756.h5
64/64 [==============================] - 24s 380ms/step - loss: 2.5812 - accuracy: 0.4787 - val_loss: 2.1533 - val_accuracy: 0.5228
Epoch 2/15
64/64 [==============================] - ETA: 0s - loss: 2.0375 - accuracy: 0.5347
Epoch 00002: val_accuracy improved from 0.52276 to 0.54298, saving model to model-002-0.542984.h5
64/64 [==============================] - 21s 329ms/step - loss: 2.0375 - accuracy: 0.5347 - val_loss: 1.9676 - val_accuracy: 0.5430
Epoch 3/15
64/64 [==============================] - ETA: 0s - loss: 1.8448 - accuracy: 0.5578
Epoch 00003: val_accuracy improved from 0.54298 to 0.56637, saving model to model-003-0.566372.h5
64/64 [==============================] - 21s 325ms/step - loss: 1.8448 - accuracy: 0.5578 - val_loss: 1.8493 - val_accuracy: 0.5664
Epoch 4/15
64/64 [==============================] - ETA: 0s - loss: 1.6806 - accuracy: 0.5791
Epoch 00004: val_accuracy improved from 0.56637 to 0.57965, saving model to model-004-0.579646.h5
64/64 [==============================] - 21s 322ms/step - loss: 1.6806 - accuracy: 0.5791 - val_loss: 1.7114 - val_accuracy: 0.5796
Epoch 5/15
64/64 [==============================] - ETA: 0s - loss: 1.5331 - accuracy: 0.5933
Epoch 00005: val_accuracy did not improve from 0.57965
64/64 [==============================] - 21s 322ms/step - loss: 1.5331 - accuracy: 0.5933 - val_loss: 1.6784 - val_accuracy: 0.5740
Epoch 6/15
64/64 [==============================] - ETA: 0s - loss: 1.3553 - accuracy: 0.6238
Epoch 00006: val_accuracy improved from 0.57965 to 0.59292, saving model to model-006-0.592920.h5
64/64 [==============================] - 19s 297ms/step - loss: 1.3553 - accuracy: 0.6238 - val_loss: 1.6618 - val_accuracy: 0.5929
Epoch 7/15
64/64 [==============================] - ETA: 0s - loss: 1.2088 - accuracy: 0.6558
Epoch 00007: val_accuracy did not improve from 0.59292
64/64 [==============================] - 21s 321ms/step - loss: 1.2088 - accuracy: 0.6558 - val_loss: 1.6504 - val_accuracy: 0.5796
Epoch 8/15
64/64 [==============================] - ETA: 0s - loss: 1.0746 - accuracy: 0.6902
Epoch 00008: val_accuracy improved from 0.59292 to 0.59482, saving model to model-008-0.594817.h5
64/64 [==============================] - 20s 318ms/step - loss: 1.0746 - accuracy: 0.6902 - val_loss: 1.6766 - val_accuracy: 0.5948
Epoch 9/15
64/64 [==============================] - ETA: 0s - loss: 0.9511 - accuracy: 0.7275
Epoch 00009: val_accuracy improved from 0.59482 to 0.61125, saving model to model-009-0.611252.h5
64/64 [==============================] - 20s 308ms/step - loss: 0.9511 - accuracy: 0.7275 - val_loss: 1.8025 - val_accuracy: 0.6113
Epoch 10/15
64/64 [==============================] - ETA: 0s - loss: 0.8160 - accuracy: 0.7662
Epoch 00010: val_accuracy did not improve from 0.61125
64/64 [==============================] - 20s 317ms/step - loss: 0.8160 - accuracy: 0.7662 - val_loss: 1.7694 - val_accuracy: 0.6075
Epoch 11/15
64/64 [==============================] - ETA: 0s - loss: 0.7714 - accuracy: 0.7746
Epoch 00011: val_accuracy did not improve from 0.61125
64/64 [==============================] - 21s 330ms/step - loss: 0.7714 - accuracy: 0.7746 - val_loss: 1.8168 - val_accuracy: 0.6018
Epoch 12/15
64/64 [==============================] - ETA: 0s - loss: 0.7353 - accuracy: 0.7906
Epoch 00012: val_accuracy did not improve from 0.61125
64/64 [==============================] - 20s 319ms/step - loss: 0.7353 - accuracy: 0.7906 - val_loss: 1.8328 - val_accuracy: 0.6087
Epoch 13/15
64/64 [==============================] - ETA: 0s - loss: 0.7209 - accuracy: 0.7881
Epoch 00013: val_accuracy did not improve from 0.61125
64/64 [==============================] - 20s 313ms/step - loss: 0.7209 - accuracy: 0.7881 - val_loss: 1.8392 - val_accuracy: 0.6062
Epoch 14/15
64/64 [==============================] - ETA: 0s - loss: 0.7119 - accuracy: 0.7941
Epoch 00014: val_accuracy did not improve from 0.61125
64/64 [==============================] - 20s 318ms/step - loss: 0.7119 - accuracy: 0.7941 - val_loss: 1.8745 - val_accuracy: 0.6113
Epoch 15/15
64/64 [==============================] - ETA: 0s - loss: 0.6943 - accuracy: 0.7936
Epoch 00015: val_accuracy did not improve from 0.61125
64/64 [==============================] - 20s 312ms/step - loss: 0.6943 - accuracy: 0.7936 - val_loss: 1.8847 - val_accuracy: 0.6049
Accuracy of the model : 0.6049304677623262
In [77]:
gruModelRawData.plotModel()
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 150)]             0         
_________________________________________________________________
embedding (Embedding)        (None, 150, 100)          900100    
_________________________________________________________________
gru (GRU)                    (None, 128)               88320     
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense (Dense)                (None, 100)               12900     
_________________________________________________________________
dense_1 (Dense)              (None, 74)                7474      
=================================================================
Total params: 1,008,794
Trainable params: 1,008,794
Non-trainable params: 0
_________________________________________________________________
In [78]:
gruModelRawData.plotModelAccuracy(gruModelRawData_history, 'All Data Unsampled GRU')
In [81]:
from sklearn.utils import resample
maxOthers = 1000

parts = []
for grp in df_to_process['Assignment group'].unique():
    df_updt_GrpDF = df_to_process[df_to_process['Assignment group'] == grp]
    # Sample with replacement so every group ends up with exactly maxOthers rows
    parts.append(resample(df_updt_GrpDF, replace=True, n_samples=int(maxOthers), random_state=123))
# DataFrame.append was removed in pandas 2.0; pd.concat builds the same frame
df_updt_resampled = pd.concat(parts)

descending_order = df_updt_resampled['Assignment group'].value_counts().sort_values(ascending=False).index
plt.subplots(figsize=(22,5))
# rotate the x tick labels so the group names stay readable
ax=sns.countplot(x='Assignment group', data=df_updt_resampled, color='royalblue')
ax.set_xticklabels(ax.get_xticklabels(), rotation=45, ha="right")
plt.tight_layout()
plt.show()
In [82]:
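# Check how the GRU model performs on the cleansed data resampled to maxOthers (1000) rows per group to balance the target
# Note: the resampling happens before the split, so copies of the same ticket can land in both the training and validation parts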
gruModelAllDataResampled = GruGloveModel()
gruModelAllDataResampled_history, model = gruModelAllDataResampled.train(df_updt_resampled,100,epochs)
gruResampled_accuracy = gruModelAllDataResampled.prediction()
Number of Samples: 74000
Number of Labels:  74000
Number of train Samples: 59200
Number of val Samples: 14800
Epoch 1/15
592/592 [==============================] - ETA: 0s - loss: 1.8039 - accuracy: 0.5575
Epoch 00001: val_accuracy improved from -inf to 0.84912, saving model to model-001-0.849122.h5
592/592 [==============================] - 191s 323ms/step - loss: 1.8039 - accuracy: 0.5575 - val_loss: 0.5230 - val_accuracy: 0.8491
Epoch 2/15
592/592 [==============================] - ETA: 0s - loss: 0.3491 - accuracy: 0.8987
Epoch 00002: val_accuracy improved from 0.84912 to 0.92419, saving model to model-002-0.924189.h5
592/592 [==============================] - 181s 305ms/step - loss: 0.3491 - accuracy: 0.8987 - val_loss: 0.2526 - val_accuracy: 0.9242
Epoch 3/15
592/592 [==============================] - ETA: 0s - loss: 0.2012 - accuracy: 0.9365
Epoch 00003: val_accuracy improved from 0.92419 to 0.93703, saving model to model-003-0.937027.h5
592/592 [==============================] - 161s 272ms/step - loss: 0.2012 - accuracy: 0.9365 - val_loss: 0.1942 - val_accuracy: 0.9370
Epoch 4/15
592/592 [==============================] - ETA: 0s - loss: 0.1679 - accuracy: 0.9443
Epoch 00004: val_accuracy improved from 0.93703 to 0.94189, saving model to model-004-0.941892.h5
592/592 [==============================] - 169s 286ms/step - loss: 0.1679 - accuracy: 0.9443 - val_loss: 0.1721 - val_accuracy: 0.9419
Epoch 5/15
592/592 [==============================] - ETA: 0s - loss: 0.1490 - accuracy: 0.9482
Epoch 00005: val_accuracy improved from 0.94189 to 0.94723, saving model to model-005-0.947230.h5
592/592 [==============================] - 178s 300ms/step - loss: 0.1490 - accuracy: 0.9482 - val_loss: 0.1648 - val_accuracy: 0.9472
Epoch 6/15
592/592 [==============================] - ETA: 0s - loss: 0.1403 - accuracy: 0.9510
Epoch 00006: val_accuracy improved from 0.94723 to 0.94730, saving model to model-006-0.947297.h5
592/592 [==============================] - 164s 276ms/step - loss: 0.1403 - accuracy: 0.9510 - val_loss: 0.1584 - val_accuracy: 0.9473
Epoch 7/15
592/592 [==============================] - ETA: 0s - loss: 0.1329 - accuracy: 0.9526
Epoch 00007: val_accuracy did not improve from 0.94730
592/592 [==============================] - 163s 275ms/step - loss: 0.1329 - accuracy: 0.9526 - val_loss: 0.1640 - val_accuracy: 0.9448
Epoch 8/15
592/592 [==============================] - ETA: 0s - loss: 0.1311 - accuracy: 0.9539
Epoch 00008: val_accuracy did not improve from 0.94730
592/592 [==============================] - 165s 279ms/step - loss: 0.1311 - accuracy: 0.9539 - val_loss: 0.1566 - val_accuracy: 0.9470
Epoch 9/15
592/592 [==============================] - ETA: 0s - loss: 0.1291 - accuracy: 0.9533
Epoch 00009: val_accuracy did not improve from 0.94730
592/592 [==============================] - 178s 301ms/step - loss: 0.1291 - accuracy: 0.9533 - val_loss: 0.1567 - val_accuracy: 0.9468
Epoch 10/15
592/592 [==============================] - ETA: 0s - loss: 0.1239 - accuracy: 0.9551
Epoch 00010: val_accuracy did not improve from 0.94730
592/592 [==============================] - 177s 298ms/step - loss: 0.1239 - accuracy: 0.9551 - val_loss: 0.1600 - val_accuracy: 0.9468
Epoch 11/15
592/592 [==============================] - ETA: 0s - loss: 0.1143 - accuracy: 0.9577
Epoch 00011: val_accuracy improved from 0.94730 to 0.95020, saving model to model-011-0.950203.h5
592/592 [==============================] - 160s 271ms/step - loss: 0.1143 - accuracy: 0.9577 - val_loss: 0.1534 - val_accuracy: 0.9502
Epoch 12/15
592/592 [==============================] - ETA: 0s - loss: 0.1127 - accuracy: 0.9581
Epoch 00012: val_accuracy did not improve from 0.95020
592/592 [==============================] - 144s 243ms/step - loss: 0.1127 - accuracy: 0.9581 - val_loss: 0.1530 - val_accuracy: 0.9489
Epoch 13/15
592/592 [==============================] - ETA: 0s - loss: 0.1121 - accuracy: 0.9581
Epoch 00013: val_accuracy did not improve from 0.95020
592/592 [==============================] - 153s 259ms/step - loss: 0.1121 - accuracy: 0.9581 - val_loss: 0.1526 - val_accuracy: 0.9495
Epoch 14/15
592/592 [==============================] - ETA: 0s - loss: 0.1126 - accuracy: 0.9577
Epoch 00014: val_accuracy improved from 0.95020 to 0.95088, saving model to model-014-0.950878.h5
592/592 [==============================] - 163s 276ms/step - loss: 0.1126 - accuracy: 0.9577 - val_loss: 0.1548 - val_accuracy: 0.9509
Epoch 15/15
592/592 [==============================] - ETA: 0s - loss: 0.1119 - accuracy: 0.9579
Epoch 00015: val_accuracy did not improve from 0.95088
592/592 [==============================] - 182s 307ms/step - loss: 0.1119 - accuracy: 0.9579 - val_loss: 0.1554 - val_accuracy: 0.9493
Accuracy of the model : 0.9492567567567568
In [83]:
gruModelAllDataResampled.plotModelAccuracy(gruModelAllDataResampled_history, 'All Data Resampled GRU')